Autosomal dominant polycystic kidney disease (ADPKD) is a common genetic disorder which frequently results in renal failure, due
to progressive cyst development. The major locus, PKD1, maps to 16p13.3. A chromosome translocation associated with
ADPKD is identified which disrupts a gene (PBP), encoding a 14 kb transcript, in the PKD1 candidate region. Further mutations of the PBP gene [...]
treated or prevented by PKD1 gene therapy and/or administration of functional PKD1 protein to affected cells.

POLYCYSTIC KIDNEY DISEASE 1 GENE AND USES THEREOF



The present invention relates to the polycystic
kidney disease 1 (PKD1) gene, mutations thereof in
patients having PKD1-associated disorders, the protein
encoded by the PKD1 gene, and their uses in diagnosis
and therapy.

Background to the Invention 

All references mentioned below are listed in
full at the end of the description and are herein
incorporated by reference in their entirety. Except
where the context clearly indicates otherwise,
references to the PBP gene, transcript, sequence,
protein or the like can be read as referring to the
PKD1 gene, transcript, sequence, protein or the like,
respectively.

A landmark study by Dalgaard (1957) showed that
autosomal dominant polycystic kidney disease (ADPKD),
also termed adult polycystic kidney disease (APKD), is
one of the commonest genetic diseases of man
(approximately 1/1000 individuals affected). The major
feature of this dominant disease is the development of
cystic kidneys, which commonly leads to renal failure in
adult life. This simple description, however, belies
a diverse systemic disorder that affects many other
organs (reviewed in Gabow, 1990) and occasionally
presents in childhood (Fink, et al., 1993;
Zerres, et al., 1993). Extrarenal manifestations
include liver cysts (Milutinovic, et al., 1980) and,
more rarely, cysts of the pancreas (Gabow, 1993) and
other organs. Intracranial aneurysms occur in
approximately 5% of patients and are a significant
cause of morbidity and mortality due to subarachnoid
haemorrhage (Chapman, et al., 1992). More recently, an
increased prevalence of cardiac valve defects (Hossack,
et al., 1988), herniae (Gabow, 1990) and colonic
diverticula (Scheff, et al., 1980) has been reported.

The major cause of morbidity in ADPKD, however, is
progressive renal disease characterised by the
formation and enlargement of fluid filled cysts,
resulting in grossly enlarged kidneys. Renal function
deteriorates as normal tissue is compromised by cystic
growth, resulting in end stage renal disease (ESRD) in
more than 50% of patients by the age of 60 years
(Gabow, et al., 1992); ADPKD accounts for 8-10% of all
renal transplantation and dialysis patients in Europe
and the USA (Gabow, 1993). Biochemical studies have
suggested several potential causes of cyst formation
and development, including: abnormal epithelial cell
growth, alterations to the extracellular matrix and
changes in cellular polarity and secretion (reviewed in
Gabow, 1991; Wilson and Sherwood, 1991). The primary
defect in ADPKD, however, remains unclear and
considerable effort has therefore been applied to
identifying the defective gene(s) in this disorder by
genetic approaches.

The first step towards positional cloning of an
ADPKD gene was the demonstration of linkage of one
locus, now designated the polycystic kidney disease 1
(PKD1) locus, to the α-globin cluster on the short arm
of chromosome 16 (Reeders, et al., 1985).
Subsequently, families with ADPKD unlinked to markers
of 16p were described (Kimberling, et al., 1988;
Romeo, et al., 1988) and a second ADPKD locus (PKD2)
has recently been assigned to chromosome region
4q13-q23 (Kimberling, et al., 1993; Peters, et al., 1993).
It is estimated that approximately 85% of ADPKD is due
to PKD1 (Peters and Sandkuijl, 1992) with PKD2
accounting for most of the remainder. PKD2 appears to
be a milder condition with a later age of onset and
ESRD (Parfrey, et al., 1990; Gabow, et al., 1992;
Ravine, et al., 1992).

The position of the PKD1 locus was refined to
chromosome band 16p13.3 and many markers were isolated
from that region (Breuning, et al., 1987; Reeders, et
al., 1988; Breuning, et al., 1990; Germino, et al.,
1990; Hyland, et al., 1990; Himmelbauer, et al.,
1991). Their order, and the position of the PKD1
locus, have been determined by extensive linkage
analysis in normal and PKD1 families and by the use of
a panel of somatic cell hybrids (Reeders, et al., 1988;
Breuning, et al., 1990; Germino, et al., 1990). An
accurate long range restriction map (Harris, et al.,
1990; Germino, et al., 1992) has located the PKD1
locus in an interval of approximately 600 kb between
the markers GGG1 and SM7 (Harris, et al., 1991;
Somlo, et al., 1992) (see Figure 1a). The density of
CpG islands and identification of many mRNA transcripts
indicated that this area is rich in gene sequences.
Germino et al. (1992) estimated that the candidate
region contains approximately 20 genes.

Identification of the PKD1 gene from within this
area has thus proved difficult and other means to
pinpoint the disease gene were sought. Linkage
disequilibrium has been demonstrated between PKD1 and
the proximal marker VK5, in a Scottish population
(Pound, et al., 1992) and between PKD1 and BLu24 (see
Figure 1a), in a Spanish population (Peral, et al.,
1994). Studies with additional markers have shown
evidence of a common ancestor in a proportion of each
population (Peral, et al., 1994; Snarey, et al.,
1994), but the association has not precisely positioned
the PKD1 locus.

Disease associated genomic rearrangements,
detected by cytogenetics or pulsed field gel
electrophoresis (PFGE), have been instrumental in the
identification of various genes associated with various
genetic disorders. Hitherto, no such abnormalities
related to PKD1 have been described. This situation
contrasts with that for the tuberous sclerosis locus,
which lies within 16p13.3 (TSC2). In that case, TSC
associated deletions were detected by PFGE within the
interval thought to contain the PKD1 gene and their
characterisation was a significant step toward the
rapid identification of the TSC2 gene (European
Chromosome 16 Tuberous Sclerosis Consortium, 1993).
The TSC2 gene therefore maps within the candidate
region for the hitherto unidentified PKD1 gene; as
polycystic kidneys are a feature common to TSC and
ADPKD1 (Bernstein and Robbins, 1991) the possibility of
an aetiological link, as proposed by Kandt et al.
(1992), was considered.

We have now identified a pedigree in which the two
distinct phenotypes, typical ADPKD or TSC, are seen in
different members. In this family, the two individuals
with ADPKD are carriers of a balanced chromosome
translocation with a breakpoint within 16p13.3. We
have located the chromosome 16 translocation breakpoint
and a gene disrupted by this rearrangement has been
defined; the discovery of additional mutations of that
gene in other PKD1 patients shows that we have
identified the PKD1 gene.

Summary of the Invention

Accordingly, in one aspect, this invention
provides an isolated, purified or recombinant nucleic
acid sequence comprising:

(a) a PKD1 gene or its complementary strand,

(b) a sequence substantially homologous to, or
capable of hybridising to, a substantial portion of a
molecule defined in (a) above,

(c) a fragment of a molecule defined in (a) or
(b) above.

In particular, there is provided a sequence
wherein the PKD1 gene has the partial nucleic acid
sequence according to Figure 7 and/or 10. The
invention therefore includes a DNA molecule selected
from:

(a) a PKD1 gene or its complementary strand,

(b) a sequence substantially homologous to, or
capable of hybridising to, a substantial portion of a
molecule defined in (a) above,

(c) a molecule coding for a polypeptide having
the partial sequence of Figure 7,

(d) genomic DNA corresponding to a molecule in
(a) above; and

(e) a fragment of a molecule defined in any of
(a), (b), (c) or (d) above.

The PKD1 gene described herein is a gene found on
human chromosome 16, and the results of familial
studies described herein form the basis for concluding
that this PKD1 gene encodes a protein called PKD1
protein which has a role in the prevention or
suppression of ADPKD. The PKD1 gene therefore includes
the DNA sequences shown in Figures 7 and 10, and all
functional equivalents. The gene furthermore includes
regulatory regions which control the expression of the
PKD1 coding sequence, including promoter, enhancer and
terminator regions. Other DNA sequences such as
introns spliced from the end-product PKD1 RNA
transcript are also encompassed. Although work has
been carried out in relation to the human gene, the
corresponding genetic and functional sequences present
in lower animals are also encompassed.

The present invention therefore further provides a
PKD1 gene or its complementary strand having the
partial sequence according to Figure 7. In particular,
it provides a PKD1 gene or its complementary strand
having the partial sequence of Figures 7 and/or 10
which gene or strand is mutated in some ADPKD patients
(more specifically, PKD1 patients).

The invention further provides a nucleic acid sequence comprising a
mutant PKD1 gene, especially one selected from a sequence comprising a
partial sequence according to Figures 7 and/or 10 when:

(a) [OX114] base pairs 1746-2192 as defined in Figure 7 are deleted
(446bp);

(b) [OX32] base pairs 3696-3831 as defined in Figure 7 are deleted
by a splicing defect;

(c) [OX875] about 5.5kb flanked by the two XbaI sites shown in
Figure 3a are deleted and the EcoRI site separating the CW10 (41kb) and JH1
(18kb) sites is thereby absent;

(d) [WS-53] about 100kb extending between the JH1 and CW21 and the
SM6 and JH17 sites shown in Figure 6 are deleted and the PKD1 gene is thereby absent,
the deletion lying proximally between SM6 and JH13;

(e) [461] 18bp are deleted in the 75bp intron amplified by the
primer pair 3A3C insert at position 3696 of the 3' sequence as shown in
Figure 11;

(f) [OX1054] 20bp are deleted in the 75bp intron amplified by the
primer pair 3A3C insert at position 3696 of the 3' sequence as shown in
Figure 11;

(g) [WS-212] about 75kb are deleted between SM9-CW9 distally and the
PKD1 3'UTR proximally as shown in Figure 12;

(h) [WS-215] about 160kb are deleted between CW20 and SM6-JH17 as shown
in Figure 12;

(i) [WS-227] about 50kb are deleted between CW20 and JH11 as shown in
Figure 12;

(j) [WS-219] about 27kb are deleted between JH1 and JH6 as shown in
Figure 12;

(k) [WS-250] about 160kb are deleted between CW20 and BLu24 as shown in
Figure 12;

(l) [WS-194] about 65kb are deleted between CW20 and CW10.

The invention therefore extends to RNA molecules comprising an RNA
sequence corresponding to any of the DNA sequences set out above. The
molecule is preferably the transcript referenced PBP and
identifiable from the restriction map of Figure 3a and
having a sequence of about 14 kb.

In another aspect, the invention provides a
nucleic acid probe having a sequence as set out above;
in particular, this invention extends to a purified
nucleic acid probe which hybridises to at least a
portion of the DNA or RNA molecule of any of the
preceding sequences. Preferably, the probe includes a
label such as a radiolabel, for example a 32P label.

In another aspect, this invention provides a
purified DNA or RNA coding for a protein comprising the
amino acid sequence of Figure 7 and/or 10, or a protein
or polypeptide having homologous properties with said
protein, or having at least one functional domain or
active site in common with said protein.

The DNA molecule defined above may be incorporated
in a recombinant cloning vector for expressing a
protein having the amino acid sequence of Figure 7
and/or 10, or a protein or a polypeptide having at
least one functional domain or active site in common
with said protein.

In another aspect, the invention provides a
polypeptide encoded by a sequence as set out above, or
having the amino acid sequence according to the partial
amino acid sequence of Figure 7 and/or 10, or a protein
or polypeptide having homologous properties with said
protein, or having at least one functional domain or
active site in common with said protein. In
particular, there is provided an isolated, purified or
recombinant polypeptide comprising a PKD1 protein or a
mutant or variant thereof or encoded by a sequence set
out above or a variant thereof having substantially the
same activity as the PKD1 protein.

This invention also provides an in vitro method of
determining whether an individual is likely to be
affected with tuberous sclerosis, comprising the steps
of:

assaying a sample from the individual to determine
the presence and/or amount of PKD1 protein or
polypeptide having the amino acid sequence of Figure 7
and/or 10.

Additionally or alternatively, a sample may be
assayed to determine the presence and/or amount of mRNA
coding for the protein or polypeptide having the amino
acid sequence of Figure 7 and/or 10, or to determine
the fragment lengths of fragments of nucleotide
sequences coding for the protein or polypeptide of
Figure 7 and/or 10, or to detect inactivating mutations
in DNA coding for a protein having the amino acid
sequence of Figure 7 and/or 10 or a protein having
homologous properties. Said screening preferably
includes applying a nucleic acid amplification process
to said sample to amplify a fragment of the DNA
sequence. Said nucleic acid amplification process
advantageously utilizes at least one of the following
sets of primers as identified herein:

AH3 F9 : AH3 B7
3A3 C1 : 3A3 C2
AH4 F2 : JH14 B3

Alternatively, said screening method may comprise
digesting said sample to provide EcoRI fragments and
hybridising with a DNA probe which hybridises to the
EcoRI fragment identified (A) in Figure 3(a), and said
DNA probe may comprise the DNA probe CW10 identified
herein.

Another screening method may comprise digesting
said sample to provide BamHI fragments and hybridising
with a DNA probe which hybridises to the BamHI fragment
identified (B) in Figure 3(a), and said DNA probe may
comprise the DNA probe 1A1H.6 identified herein.

A method according to the present invention may
comprise detecting a PKD1-associated disorder in a
patient suspected of having, or having a predisposition
to, said disorder, the method comprising detecting the
presence of and/or evaluating the characteristics of
PKD1 DNA, PKD1 mRNA and/or PKD1 protein in a sample
taken from the patient. Such a method may comprise
detecting and/or evaluating whether the PKD1 DNA is
deleted, missing, mutated, aberrant or not expressing
normal PKD1 protein. One way of carrying out such a
method comprises:

A. taking a biological, tissue or biopsy
sample from the patient;

B. detecting the presence of and/or evaluating
the characteristics of PKD1 DNA, PKD1 mRNA and/or PKD1
protein in the sample to obtain a first set of results;

C. comparing the first set of results with a
second set of results obtained using the same or
similar methodology for an individual not suspected of
having said disorders; and if the first and second sets
of results differ in that the PKD1 DNA is deleted,
missing, aberrant, mutated or not expressing PKD1
protein then that indicates the presence,
predisposition or tendency of the patient to develop
said disorders.

A specific method according to the invention
comprises extracting a sample of PKD1 DNA or DNA from
the PKD1 locus purporting to be PKD1 DNA from a
patient, cultivating the sample in vitro and analysing
the resulting protein, and comparing the resulting
protein with normal PKD1 protein according to the
well-established Protein Truncation Test.
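
As a rough illustration of what a truncation test reads out, the sketch below compares the length of the open reading frame in a control coding sequence with that in a patient sequence; a premature stop codon shortens the translated product. The two toy sequences and the function name `codons_before_stop` are hypothetical and are not taken from the PKD1 sequence of Figure 7 or Figure 10.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_before_stop(cds: str) -> int:
    """Count codons read in frame from position 0 up to the first stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    count = 0
    for i in range(0, usable, 3):
        if cds[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


# Hypothetical toy sequences (not PKD1 sequence).
normal = "ATGGCTGAACGTTTAGGCTGA"  # ATG GCT GAA CGT TTA GGC TGA
mutant = "ATGGCTTAACGTTTAGGCTGA"  # ATG GCT TAA ... (premature stop)

normal_len = codons_before_stop(normal)
mutant_len = codons_before_stop(mutant)
is_truncated = mutant_len < normal_len
print(normal_len, mutant_len, is_truncated)
```

In practice the comparison is made on the size of the translated protein on a gel rather than on the sequence itself; the sketch only shows why a truncating mutation gives a shorter product.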

Less sensitive tests include analysis of RNA using
RT PCR (reverse transcriptase polymerase chain
reaction) and examination of genomic DNA.

On the other hand, step C of the method may be
replaced by:

C. comparing the first set of results with a
second set of results obtained using the same or
similar methodology in an individual known to have the
disorder, or at least one of said disorders; and if the first
and second sets of results are substantially identical,
this indicates that the PKD1 DNA in the patient is
deleted, mutated or not expressing normal PKD1 protein.

The invention further provides a method of
characterising a mutation in a subject suspected of
having a mutation in the PKD1 gene, which method
comprises:

A. amplifying each of the exons in the PKD1
gene of the subject;

B. denaturing the complementary strands of the
amplified exons;

C. diluting the denatured separate,
complementary strands to allow each single-stranded DNA
molecule to assume a secondary structural conformation;

D. subjecting the DNA molecule to
electrophoresis under non-denaturing conditions;

E. comparing the electrophoresis pattern of
the single-stranded molecule with the electrophoresis
pattern of a single-stranded molecule containing the
same amplified exon from a control individual which has
either a normal or PKD1 heterozygous genotype; and

F. sequencing any amplification product which
has an electrophoretic pattern different from the
pattern obtained from the DNA of the control
individual.
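
The comparison in steps E and F can be sketched as follows. This is a minimal illustration, assuming each single-stranded product is summarised as a list of band migration distances; the exon labels, the distances and the 0.5 mm tolerance are all hypothetical values chosen for the example.

```python
def patterns_differ(sample, control, tolerance=0.5):
    """Return True if band count differs or any band is shifted beyond tolerance."""
    if len(sample) != len(control):
        return True
    paired = zip(sorted(sample), sorted(control))
    return any(abs(s - c) > tolerance for s, c in paired)


def exons_to_sequence(sample_patterns, control_patterns, tolerance=0.5):
    """Step E: flag exons whose pattern differs from control; step F sequences these."""
    return [
        exon
        for exon, bands in sample_patterns.items()
        if patterns_differ(bands, control_patterns[exon], tolerance)
    ]


# Hypothetical migration distances in mm for two amplified exons.
control = {"exon_A": [31.0, 34.2], "exon_B": [28.5, 30.1]}
patient = {"exon_A": [31.1, 34.0], "exon_B": [28.5, 30.1, 32.4]}

flagged = exons_to_sequence(patient, control)
print(flagged)
```

Only the flagged products would then be carried forward to sequencing, which keeps the sequencing effort proportional to the number of aberrant patterns rather than the number of exons.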

The invention also extends to a diagnostic kit for
carrying out a method as set out above, comprising
nucleic acid primers for amplifying a fragment of the
DNA or RNA sequences defined above. The nucleic acid
primers may comprise at least one of the following
sets:

AH3 F9 : AH3 B7
3A3 C1 : 3A3 C2
AH4 F2 : JH14 B3

Another embodiment of the kit may comprise one or more
substances for digesting a sample to provide EcoRI
fragments and a DNA probe as previously defined.

A further embodiment of the kit may comprise one or
more substances for digesting a sample to provide BamHI
fragments and a DNA probe as previously defined.

Still further, a kit may include a nucleic acid
probe capable of hybridising to the DNA or RNA molecule
previously defined.

A vector (such as Bluescript (available from
Stratagene)) comprising a nucleic acid sequence set out
above; and a host cell (such as E. coli strain XL-1
Blue (available from Stratagene)) transfected or
transformed with the vector are also provided, together
with the use of such a vector or a nucleic acid
sequence set out above in gene therapy and/or in the
preparation of an agent for treating or preventing a
PKD1-associated disorder. Therefore there is further
provided a method of treating or preventing a
PKD1-associated disorder which method comprises
administering to a patient in need thereof a functional
PKD1 gene to affected cells in a manner that permits
expression of PKD1 protein therein and/or a transcript
produced from a mutated chromosome (such as the deleted
WS-212 chromosome) which is capable of expressing
functional PKD1 protein therein.

The invention also extends to any inventive
combination of features set out above or in the
following description.

Brief Description Of The Drawings

Figure 1a (top): A long range map of the terminal
region of the short arm of chromosome 16 showing the
PKD1 candidate region defined by genetic linkage
analysis. The positions of selected DNA probes and
microsatellites used for haplotype, linkage or
heterozygosity analyses are indicated. Markers
previously described in linkage disequilibrium studies
are shown in bold (from: Harris, et al., 1990; Harris,
et al., 1991; Germino, et al., 1992; Somlo, et al.,
1992; Peral, et al., 1994; Snarey, et al., 1994).

(bottom): A detailed map of the distal part of
the PKD1 candidate region showing: the area of 16p13.3
duplicated in 16p13.1 (hatched); C, Cla I restriction
sites; the breakpoints in the somatic cell hybrids,
N-OH1 and P-MWH2A; DNA probes and the TSC2 gene. The
limits of the position of the translocation breakpoint
found in family 77 (see b), determined by evidence of
heterozygosity (in 77-4) and PFGE (see c and text), are
also indicated. The contig covering the 77 breakpoint
region consists of the cosmids: 1, CW9D; 2, ZDS5; 3,
JH2A; 4, REP59; 5, JC10.2B; 6, CW10III; 7, SM25A; 8,
SM11; 9, NM17.

Figure 1b: Pedigree of family 77 which segregates
a 16;22 translocation; showing the chromosomal
composition of each subject. Individuals 77-2 and 77-3
have the balanced products of the exchange - and have
PKD1; 77-4 is monosomic for 16p13.3→16pter and
22q11.21→22pter - and has TSC.

Figure 1c: PFGE of DNA from members of the 77
family: 77-1 (1); 77-2 (2); 77-3 (3); 77-4 (4);
digested with Cla I and hybridised with SM6. In
addition to the normal fragment of 340 kb and a partially
digested fragment of 480 kb, a proximal breakpoint
fragment of approximately 100 kb (arrowed) is seen in
individuals 77-2, 77-3 and 77-4; concordant with
segregation of the der(16) chromosome.

Figure 2: FISH of the cosmid CW10III (cosmid 6;
Figure 1a) to a normal male metaphase. Duplication of
this locus is illustrated with two sites of
hybridisation on 16p; the distal site (the PKD1 region)
is arrowed. The signal from the proximal site
(16p13.1) is stronger than that from the distal,
indicating that sequences homologous to CW10III are
reiterated in 16p13.1.

Figure 3a: A detailed map of the 77 translocation
region showing the precise localisation of the 77
breakpoint and the region that is duplicated in 16p13.1
(hatched). DNA probes (open boxes); the transcripts,
PKD1 and TSC2 (filled boxes; with direction of
transcription indicated by an arrow) and cDNAs (grey
boxes) are shown below the genomic map. The known
genomic extent of each gene is indicated at the bottom
of the diagram and the approximate genomic location of
each cDNA is indicated under the genomic map. The
positions of genomic deletions found in PKD1 patients,
OX875 and OX114, are also indicated. Restriction sites
for EcoR I (E) and incomplete maps for BamH I (B); Sac
I (S) and Xba I (X) are shown. SM3 is a 2kb BamH I
fragment shown at the 5' end of the gene.

Figure 3b: Southern blots of BamH I digested DNA
from individuals: 77-1 (1); 77-2 (2); and 77-4 (4)
hybridised with: left panel, 8S3 and right panel, 8S1
(see a). 8S3 detects a novel fragment on the telomeric
side of the breakpoint (12 kb: arrowed) associated
with the der(22) chromosome in 77-2, but not 77-4;
8S1 identifies a novel fragment on the centromeric side
of the breakpoint (9 kb: arrowed) - associated with the
der(16) chromosome - in 77-2 and 77-4. The telomeric
breakpoint fragment is also seen weakly with 8S1
(arrowed) indicating that the breakpoint lies in the
distal part of 8S1. The 8S3 and 8S1 loci are both
duplicated; the normal BamH I fragment detected at the
16p13.3 site by these probes is 11 kb (see a), but a
similar sized fragment is also detected at the 16p13.1
site. Consequently, the breakpoint fragments are much
fainter than the normal (16p13.1 plus 16p13.3) band.

Figure 4a: PBP cDNA, 3A3, hybridised to a Northern
blot containing ~1 µg polyA selected mRNA per lane of
the tissue specific cell lines: lane 1, MJ,
EBV-transformed lymphocytes; lane 2, K562,
erythroleukaemia; lane 3, FS1, normal fibroblasts; lane
4, HeLa, cervical carcinoma; lane 5, G401, renal Wilms'
tumour; lane 6, Hep3B, hepatoma; lane 7, HT29, colonic
adenocarcinoma; lane 8, SW13, adrenal carcinoma; lane
9, G-CCM, astrocytoma. A single transcript of
approximately 14 kb is seen; the highest level of
expression is in fibroblasts and in the astrocytoma
cell line, G-CCM. Although in this comparative
experiment little expression is seen in lanes 1, 4 and
7, we have demonstrated at least a low level of
expression in these cell lines on other Northern blots
and by RT-PCR (see later).

Figure 4b: A Northern blot containing ~20 µg of
total RNA from the cell line G-CCM hybridised with
cDNAs or a genomic probe which identify various parts
of the PBP gene. Left panel, a single ~14 kb
transcript is seen with a cDNA from the single copy
area, 3A3. Right panel, a cDNA, 21P.9, that is
homologous to parts of the region that is duplicated
(JH12, JH8 and JH10; see Figure 3a) hybridises to the
PBP transcript and three novel transcripts; HG-A (~21
kb), HG-B (~17 kb) and HG-C (8.5 kb). A similar
pattern of transcripts is seen with cDNAs and genomic
fragments that hybridise to the area between JH5 and
JH13, with the exception of the JH8 area. Middle
panel, JH8 hybridises to the transcripts PBP, HG-A and
HG-B but not to HG-C.

Figure 4c: A Northern blot of 20 µg total
fibroblast RNA from: normal control (N); 77-2 (2); 77-4
(4) hybridised with 8S1, which contains the 16;22
translocation breakpoint (see Figure 3). A transcript
of ~9 kb (PBP-77) is identified in the two patients
with this translocation but not in the normal control.
PBP-77 is a chimeric PBP transcript formed due to the
translocation and is not seen in 77-2 or 77-4 RNA with
probes which map distal to the breakpoint.

Figure 5a: FIGE of DNA from: normal (N) and ADPKD
patient OX875 (875), digested with EcoR I and
hybridised with, left panel, CW10; middle panel, JH1.
Normal fragments of 41 kb (plus a 31 kb fragment from
the 16p13.1 site), CW10, and 18 kb, JH1, are identified
with these probes; OX875 has an additional 53 kb band
(arrowed). The EcoR I site separating these two
fragments is removed by the deletion (see Figure 3a).
The right panel shows a Southern blot of BamH I
digested DNA (as above) hybridised with 1A1H.6. A
novel fragment of 9.5 kb is seen in OX875 DNA, as well
as the normal 15 kb fragment. These results indicate
that OX875 has a 5.5 kb deletion; its position was
determined more precisely by mapping relative to two
Xba I sites which flank the deletion (see Figure 3a).

Figure 5b: Northern blot of total fibroblast RNA,
as (a), hybridised with the cDNAs, AH4, 3A3 and AH3. A
novel transcript (PBP-875) of ~11 kb is seen with AH4
(the band is reduced in intensity because the probe is
partly deleted) and AH3 (arrowed), which flank the
deletion, but not 3A3 which is entirely deleted (see
Figure 3a). The transcripts HG-A, HG-B and HG-C, from
the duplicated area, are seen with AH3 (see Figure 4b).

Figure 5c: Left panel; FIGE of DNA from: normal
(N) and ADPKD patient OX114 (114), digested with EcoR I
and hybridised with CW10; a novel fragment of 39 kb
(arrowed) is seen in OX114. Middle panel; DNA, as
above, plus the normal mother (M) and brother (B) of
OX114 digested with BamH I and hybridised with CW21. A
larger than normal fragment of 19 kb (arrowed) was
detected in OX114 but not other family members due to
deletion of a BamH I site; together these results are
consistent with a 2 kb deletion (see Figure 3a). Right
panel; RT-PCR of RNA, as above, with primers flanking
the OX114 deletion (see Experimental Procedures). A
novel fragment of 810 bp (arrowed) is seen in OX114,
indicating a deletion of 446 bp in the PBP transcript.
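
The size relationship in the right panel can be checked with a few lines of arithmetic. The sketch below assumes that the OX114 endpoints given in the Summary (base pairs 1746 and 2192 of Figure 7) are boundary positions, so that their difference gives the 446 bp quoted there; the normal product size it prints is inferred from the two figures in the caption and is not stated in the text. Function names are illustrative only.

```python
def deletion_length(start_bp: int, end_bp: int) -> int:
    """Length removed between two boundary positions (assumed convention)."""
    return end_bp - start_bp


def normal_product_size(mutant_product_bp: int, deleted_bp: int) -> int:
    """Size expected from the undeleted transcript with the same primer pair."""
    return mutant_product_bp + deleted_bp


deleted = deletion_length(1746, 2192)          # transcript deletion in OX114
inferred_normal = normal_product_size(810, deleted)
print(deleted, inferred_normal)
```

If the coordinates were instead read as inclusive first and last deleted bases, the length would come out one base longer, so the convention matters when comparing against the quoted 446 bp.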

Figure 5d: RT-PCR of RNA from: ADPKD patient OX32
(32) plus the proband's normal mother (M) and affected
father (F) and sibs (1) and (2) using the C primer pair
from 3A3 (see Experimental Procedures). A novel
fragment of 125 bp is detected in each of the affected
individuals.

Figure 6: Map of the region containing the TSC2
and PBP genes showing the area deleted in patient WS-53
and the position of the 77 translocation breakpoint.
Localisation of the distal end of the WS-53 deletion
was previously described (European Chromosome 16
Tuberous Sclerosis Consortium, 1993) and we have now
localised the proximal end between SM6 and JH17. The
size of the aberrant Mlu I fragment in WS-53, detected
by JH1 and JH17, is 90kb and these probes lie on
adjacent Mlu I fragments of 120kb and 70kb,
respectively. Therefore the WS-53 deletion is ~100kb.
Restriction sites for: Mlu I (M); Nru I (R); Not I (N);
and partial maps for Sac II (S) and BssH II (H) are
shown. DNA probes (open boxes) and the TSC2 and PBP
transcripts (filled boxes) are indicated below the line
with their known genomic extents (brackets). The
locations of the microsatellites KG8 and SM6 are also
indicated.
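
The size estimate for WS-53 follows from simple fragment arithmetic: when a deletion joins adjacent restriction fragments into one aberrant fragment, the amount lost is the sum of the normal fragments minus the aberrant one. A minimal sketch, with the function name being illustrative only and the OX875 values taken from the Figure 5a caption as a cross-check:

```python
def deletion_size_kb(normal_fragments_kb, aberrant_fragment_kb):
    """Deleted length = total of the normal fragments minus the aberrant fragment."""
    return sum(normal_fragments_kb) - aberrant_fragment_kb


# WS-53, Mlu I: adjacent 120 kb and 70 kb fragments replaced by one of 90 kb.
ws53 = deletion_size_kb([120, 70], 90)

# OX875, EcoR I: 41 kb and 18 kb fragments replaced by a 53 kb band.
ox875_ecori = deletion_size_kb([41, 18], 53)

# OX875, BamH I: the normal 15 kb fragment is reduced to 9.5 kb.
ox875_bamhi = deletion_size_kb([15], 9.5)

print(ws53, ox875_ecori, ox875_bamhi)
```

The EcoR I and BamH I estimates for OX875 differ slightly (6 kb against 5.5 kb), which is within what would be expected from sizing large fragments on a gel.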

Figure 7: The partial nucleotide sequence (cDNA)
of the PKD1 transcript extending 5631bp to the 3' end
of the gene. The corresponding predicted protein (also
shown in SEQ ID NO: 4:) is shown below the sequence and
extends from the start of the nucleotide sequence. The
GT-repeat, KG8, is in the 3' untranslated region
between 5430-5448 bp. This sequence corresponds to
GenBank Accession No. L33243 and is shown in SEQ ID NO:
3:.

Figure 8: The sequence of the probe 1A1H.6 (also
shown in SEQ ID NO: 5:).

Figure 9: The sequence (SEQ ID NO: 6:) of the
probe CW10 which is about 0.5kb.

Figure 10: The larger partial nucleotide sequence
(SEQ ID NO: 1:) of the PKD1 transcript (cDNA) extending
from bp 2 to 13807bp to the 3' end of the gene together
with the corresponding predicted protein (also shown in
SEQ ID NO: 2:). This larger partial sequence
encompasses the (smaller) partial sequence of Figure 7
from amino acid no. 2726 in SEQ ID NO: 3: and relates
to the entire PKD1 gene sequence apart from its extreme
5' end.

Figure 11: A map of the 75bp intron amplified by 
the primer set 3A3C insert at position 3696 of the 3' 
sequence showing the positions of genomic deletions 
found in PKD1 patients 461 and OX1054. 

Figure 12: A map of the region of chromosome 16
containing the TSC2 and PKD1 genes showing the areas
affected in patients WS-215, WS-250, WS-212, WS-194,
WS-227 and WS-219; also WS-53 (but cf. Figure 6).
Genomic sites for the enzymes MluI (M), ClaI (C), PvuI
(P) and NruI (R) are shown. Positions of single copy
probes and cosmids used to screen for deletions are
shown below the line which represents ~400kb of genomic
DNA. The genomic distribution of the approximately
45kb TSC2 gene and known extent of the PKD1 gene are
indicated above. The hatched area represents an ~50kb
region which is duplicated more proximally on
chromosome 16p.

Detailed Description of the Drawings
A translocation associated with ADPKD 

A major pointer to the identity of the PKD1 gene
was provided by a Portuguese pedigree (family 77) with
both ADPKD and TSC (Figure 1b). Cytogenetic analysis
showed that the mother, 77-2, has a balanced
translocation, 46XX t(16;22)(p13.3;q11.21), which was
inherited by her daughter, 77-3. The son, 77-4, has
the unbalanced karyotype, 45XY-16-22+der(16)(16qter→
16p13.3::22q11.21→22qter) and consequently is
monosomic for 16p13.3→16pter as well as for 22q11.21→
22pter. This individual has the clinical phenotype of
TSC (see Experimental Procedures); the most likely
explanation is that the TSC2 locus located within
16p13.3 is deleted in the unbalanced karyotype.

Further analysis revealed that the mother (77-2), 
and the daughter (77-3) with the balanced 
translocation, have the clinical features of ADPKD (see 
Experimental Procedures), while the parents of 77-2 
were cytogenetically normal, with no clinical features 
of TSC and no renal cysts on ultrasound examination 
(aged 67 and 82 years). Although kidney cysts can be a 
feature of TSC, no other clinical signs of TSC were
identified in 77-2 or 77-3, making it unlikely that the
polycystic kidneys were due to TSC. We therefore
investigated the possibility that the translocation
disrupted the PKD1 locus in 16p13.3 and proceeded to
identify and clone the region containing the
breakpoint.

The 77 family was analysed with polymorphic
markers from 16p13.3. Individual 77-4 was hemizygous
for MS205.2 and GGG1, but heterozygous for SM6 and more
proximal markers, locating the translocation breakpoint
between GGG1 and SM6 (see Figure 1a). Fluorescence in
situ hybridisation (FISH) of a cosmid from the TSC2
region, CW9D (cosmid 1 in Figure 1a), to metaphase
spreads showed that it hybridised to the der(22)
chromosome of 77-2, placing the breakpoint proximal to
CW9D and indicating that 77-4 was hemizygous for this
region, consistent with his TSC phenotype. DNA from
members of the 77 family was digested with Cla I,
separated by PFGE and hybridised with SM6, revealing a
breakpoint fragment of ~100 kb in individuals with the
der(16) chromosome (Figure 1c). The small size of this
novel fragment enabled the breakpoint to be localised
distal to SM6 in a region of just 60 kb (Figure 1a). A
cosmid contig covering this region was therefore
constructed (see Experimental Procedures for details).

The translocation breakpoint lies within a region
duplicated elsewhere on chromosome 16p (16p13.1)

It was previously noted that the region between
CW21 and N54 (Figure 1a) was duplicated at a more
proximal site on the short arm of chromosome 16
(Germino, et al., 1992; European Chromosome 16
Tuberous Sclerosis Consortium, 1993). Figure 2 shows
that a cosmid, CW10III, from the duplicated region
hybridises to two points on 16p: the distal PKD1
region and a proximal site positioned in 16p13.1. The
structure of the duplicated area is complex, with each
fragment present once in 16p13.3 reiterated two to four
times in 16p13.1 (see Figure 2). Cosmids spanning the
duplicated area in 16p13.3 were subcloned (see Figure
3a and Experimental Procedures for details) and a
restriction map was generated. A genomic map of the
PKD1 region was constructed using a radiation hybrid,
Hy145.19, which contains the distal portion of 16p but
not the duplicate site in 16p13.1.

To localise the 77 translocation breakpoint,
subclones from the target region were hybridised to
77-2 DNA, digested with Cla I and separated by PFGE. Once
probes mapping across the breakpoint were identified
they were hybridised to conventional Southern blots of
77 family DNA. Figure 3b shows that novel BamH I
fragments were detected from the centromeric and
telomeric side of the breakpoint, which was localised
to the distal part of the probe 8S1 (Figure 3a).
Hence, the balanced translocation was not associated
with a substantial deletion, and the breakpoint was
located more than 20 kb proximal to the TSC2 locus
(Figure 3a). These results supported the hypothesis
that polycystic kidney disease in individuals with the
balanced translocation (77-2 and 77-3) was not due to
disruption of the TSC2 gene, but indicated that a
separate gene, mapping just proximal to TSC2, was likely
to be the PKD1 gene.

The polycystic breakpoint (PBP) gene is disrupted by 
the translocation 

Localisation of the 77 breakpoint identified a
precise region in which to look for a candidate for the
PKD1 gene. During the search for the TSC2 gene we
identified other transcripts not associated with TSC,
including a large transcript (~14 kb) partially
represented in the cDNAs 3A3 and AH4 which mapped to
the genomic fragments CW23 and CW21 (Figure 3a). The
orientation of the gene encoding this transcript had
been determined by the identification of a polyA tract
in the cDNA AH4: the 3' end of this gene lies very
close to the TSC2 gene, in a tail to tail orientation
(European Chromosome 16 Tuberous Sclerosis Consortium,
1993). To determine whether this gene crossed the
translocation breakpoint, genomic probes from within the
duplicated area and flanking the breakpoint were
hybridised to Northern blots. Probes from both sides of
the breakpoint, between JH5 and JH13, identified the 14
kb transcript (Figure 3a and see below for details).
Therefore, this gene, previously called 3A3 but now
designated the PBP gene, extended over the 77
breakpoint and consequently was a candidate for the
PKD1 gene. A walk was initiated to increase the extent
of the PBP cDNA contig and several new cDNAs were
identified using probes from the single copy
(non-duplicated) region (see Experimental Procedures for
details). A cDNA contig was constructed which extended
~5.7 kb, including ~2 kb into the area that is
duplicated (Figure 3a).

Expression of the PBP gene 

Initial studies of the expression pattern of the
PBP gene were undertaken with cDNAs that map entirely
within the single copy region (e.g. AH4 and 3A3).
Figure 4a shows that the ~14 kb transcript was
identified by 3A3 in various tissue-specific cell
lines. From this and other Northern blots we concluded
that the PBP gene was expressed in all of the cell
lines tested, although often at a low level. The two
cell lines which showed the highest level of expression
were fibroblasts and a cell line derived from an
astrocytoma, G-CCM. Significant levels of expression
were also obtained in cell lines derived from kidney
(G401) and liver (Hep3B). Measuring the expression of
the PBP gene in tissue samples by Northern blotting
proved difficult because such a large transcript is
susceptible to minor RNA degradation. However, initial
results with an RNAse protection assay, using a region
of the gene located in the single copy area (see
Experimental Procedures), showed a moderate level of
expression of the PBP gene in tissue obtained from
normal and polycystic kidney (data not shown). The
widespread expression of the PBP gene is consistent
with the systemic nature of ADPKD.

Identification of transcripts that are partially
homologous to the PBP transcript

New cDNAs were identified with the genomic
fragments, JH4 and JH8, that map to the duplicated
region (Figure 3a and see Experimental Procedures).
However, when these cDNAs were hybridised to Northern
blots a more complex pattern than that seen with 3A3
was observed. As well as the ~14 kb PBP transcript,
three other, partially homologous transcripts were
identified, designated homologous gene-A (HG-A; ~21
kb), HG-B (~17 kb) and HG-C (8.5 kb) (Figure 4b).
There were two possible explanations for these results:
either the HG transcripts were alternatively spliced
forms of the PBP gene, or the HG transcripts were
encoded by genes located in 16p13.1. To determine the
genomic location of the HG loci a fragment from the 3'
end of one HG cDNA (HG-4/1.1) was isolated. HG-4/1.1
hybridised to all three HG transcripts, but not to the
PBP transcript, and on a hybrid panel it mapped to
16p13.1 (not the PKD1 area). These results show that
all the HG transcripts are related to each other
outside the region of homology with the PBP transcript
and that the HG loci map to the proximal site
(16p13.1).

An abnormal transcript associated with the 77
translocation

As the PBP gene was transcribed across the region
disrupted by the 77 translocation breakpoint, in a
proximal to distal direction on the chromosome (see
Figure 3a), it was possible that a novel transcript
originating from the PBP promoter would be found in
this family. Figure 4c shows that, using a probe to the
PBP transcript that mapped mainly proximal to the
breakpoint, a novel transcript of approximately 9 kb
(PBP-77) derived from the der(16) product of the
translocation was detected. Interestingly, the PBP-77
transcript appears to be expressed at a higher level
than the normal PBP product. These results confirmed
that the 77 translocation disrupts the PBP gene and
support the hypothesis that this is the PKD1 gene.

Mutations of the PBP gene in other ADPKD patients

To prove that the PBP gene is the defective gene
at the PKD1 locus, we analysed this region for
mutations in patients with typical ADPKD. The 3' end
of the PBP gene was most accessible to study as it maps
outside the duplicated area. To screen this region,
BamH I digests of DNA from 282 apparently unrelated
ADPKD patients were hybridised with the probe 1A1H0.6
(see Figure 3a). In addition, a large EcoR I fragment
(41 kb) which contains a significant proportion of the
PBP gene was assayed by field inversion gel
electrophoresis (FIGE) in 167 ADPKD patients, using the
probe CW10. Two genomic rearrangements were identified
in ADPKD patients by these procedures, each identified
by both methods.

The first rearrangement was identified in patient
OX875 (see Experimental Procedures for clinical
details) who was shown to have a 5.5 kb genomic
deletion within the 3' end of the PBP gene, producing a
smaller transcript (PBP-875) (see Figures 5a, b and 3a
for details). This genomic deletion results in a ~3 kb
internal deletion of the transcript with the ~500 bp
adjacent to the polyA tail intact. In this family
linkage of ADPKD to chromosome 16 could not be proven
because, although OX875 has a positive family history of
ADPKD, there were no living, affected relatives.
However, paraffin-embedded tissue from her affected
father (now deceased) was available. We demonstrated
that this individual had the same rearrangement as
OX875 by PCR amplification of a 220bp fragment spanning
the deletion (data not shown). This result, and
analysis of two unaffected sibs of OX875 who did not
have the deletion, showed that this mutation was
transmitted with ADPKD.

The second rearrangement detected by hybridisation 
was a 2 kb genomic deletion within the PBP gene, in 
ADPKD patient OX114 (see Experimental Procedures for 
clinical details and Figures 5c and 3a). No abnormal 
PBP transcript was identified by Northern blot 
analysis, but using primers flanking the deletion (see 
Experimental Procedures) a shortened product was 
detected by RT-PCR (Figure 5c). This was cloned and 
sequenced and shown to have a frame-shift deletion of 
446 bp (between base pair 1746 and 2192 of the sequence 
shown in Figure 7). OX114 is the only member of the 
family with ADPKD (she has no children) and ultrasound 
analysis of her parents at age 78 (father) and 73 years
old (mother) showed no evidence of renal cysts.
Somatic cell hybrids were produced from OX114 and the
deleted chromosome was found to be of paternal origin
by haplotype analysis. The father of OX114 is now
deceased but analysis of DNA from the brother of OX114
(OX984) with seven microsatellite markers from the PKD1
region (see Experimental Procedures) showed that he
shares the same paternal chromosome, in the PKD1
region, as OX114. Renal ultrasound revealed no cysts
in OX984 at age 53 and no deletion was detected by DNA
analysis (Figure 5c). Hence, the deletion in OX114 is
a de novo event associated with the development of
ADPKD. Although it is not possible to show that the
ADPKD is chromosome 16-linked, the location of the PBP
gene indicates that this is a de novo PKD1 mutation.

To identify more PKD1 associated mutations, single
copy regions of the PBP gene were analysed by RT-PCR
using RNA isolated from lymphoblastoid cell lines
established from ADPKD patients. cDNA from 48 unrelated
patients was amplified with the primer pair 3A3 C (see
Experimental Procedures) and the product of 260 bp was
analysed on an agarose gel. In one patient, OX32, an
additional smaller product (125 bp) was identified,
consistent with a deletion or splicing mutation. OX32
comes from a large family in which the disease can be
traced through three generations. Analysis of RNA from
two affected sibs of OX32 and his parents showed that
the abnormal transcript segregates with PKD1 (Figure
5d).

Amplification of normal genomic DNA with the 3A3 C
primers generates a product of 418 bp; sequencing
showed that this region contains two small introns (5',
75 bp and 3', 83 bp) flanking a 135 bp exon. The
product amplified from OX32 genomic DNA was normal in
size, excluding a genomic deletion. However,
heteroduplex analysis of that DNA revealed larger
heteroduplex bands, consistent with a mutation within
that genomic interval. The abnormal OX32 RT-PCR
product was cloned and sequenced: this demonstrated
that, although present in genomic DNA, the 135 bp exon
was missing from the abnormal transcript. Sequencing
of OX32 genomic DNA demonstrated a G→C transversion at
+1 of the splice donor site following the 135 bp exon.
This mutation was confirmed in all available affected
family members by digesting amplified genomic DNA with
the enzyme Bst NI: a site is destroyed by the base
substitution. The splicing defect results in an
in-frame deletion of 135 bp from the PBP transcript (3696
bp to 3831 bp of the sequence shown in Figure 7).
Together, the three intragenic mutations confirm that
the PBP gene is the defective gene at the PKD1 locus.
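
The product sizes reported for the OX32 splicing defect are
mutually consistent, and this can be checked with simple
arithmetic. The following Python sketch is illustrative only:
the function and constant names are assumptions introduced
here, and the only inputs are the sizes stated in the text
(a 418 bp genomic product, introns of 75 bp and 83 bp, and a
135 bp exon).

```python
# Illustrative check of the 3A3 C amplicon sizes quoted in the text.
# All names are hypothetical; only the bp values come from the description.

GENOMIC_PRODUCT_BP = 418   # product from normal genomic DNA
INTRONS_BP = (75, 83)      # 5' and 3' introns within the amplicon
SKIPPED_EXON_BP = 135      # exon missing from the abnormal OX32 transcript


def spliced_product(genomic_bp, introns_bp):
    """Size of the RT-PCR product once all introns are removed."""
    return genomic_bp - sum(introns_bp)


def exon_skipped_product(normal_cdna_bp, exon_bp):
    """Size of the RT-PCR product when one internal exon is skipped."""
    return normal_cdna_bp - exon_bp


def is_in_frame(length_bp):
    """A deletion keeps the reading frame if its length is divisible by 3."""
    return length_bp % 3 == 0


normal_cdna = spliced_product(GENOMIC_PRODUCT_BP, INTRONS_BP)
abnormal_cdna = exon_skipped_product(normal_cdna, SKIPPED_EXON_BP)

print(normal_cdna)                   # 260
print(abnormal_cdna)                 # 125
print(is_in_frame(SKIPPED_EXON_BP))  # True
print(3831 - 3696)                   # 135, the interval quoted for Figure 7
```

The computed values match the 260 bp normal and 125 bp abnormal
RT-PCR products, and the 135 bp loss is a whole number of codons.
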
Deletions that disrupt the TSC2 and the PKD1 gene 

We previously identified a deletion (WS-53) which
disrupts the TSC2 gene and the PKD1 gene (European
Chromosome 16 Tuberous Sclerosis Consortium, 1993),
although its full proximal extent was not determined.
Further study has shown that the deletion extends ~100
kb (see Figure 6 for details) and deletes most if not
all of the PKD1 gene. This patient has TSC but also
has unusually severe polycystic disease of the kidneys.
Other patients with a similar phenotype have also been
under investigation. Deletions involving both TSC2 and
PKD1 were identified and characterised in six patients
in whom TSC was associated with infantile polycystic
kidney disease. As well as the deletion in WS-53,
those in WS-215 and WS-250 also extended proximally
well beyond the known distribution of PKD1 and probably
delete the entire gene. The deletion in WS-194
extended over the known extent of PKD1, but not much
further proximally, while the proximal breakpoints in
WS-219 and WS-227 lay within PKD1 itself. Northern
analysis of case WS-219 with probe JH8, which lies
outside the deletion, showed a reduced level of the
PKD1 transcript but no evidence of an abnormally sized
transcript (data not shown). Analysis of samples from
the clinically unaffected parents of patients
WS-215, WS-219, WS-227 and WS-250 showed the deletions
in these patients to be de novo. The father of WS-194
was unavailable for study.

In a further case (WS-212), renal ultrasound
showed no cysts at four years of age but a deletion was
identified which removed the entire TSC2 gene and
deleted an XbaI site which is located 42bp 5' to the
polyadenylation signal of PKD1. To determine the
precise position of the proximal breakpoint in PKD1, a
587bp probe from the 3' untranslated region (3'UTRP)
was hybridised to XbaI digested DNA. A 15kb XbaI
breakpoint fragment was detected with an approximately
equal intensity to the normal fragment of 6kb,
indicating that most of the PKD1 3'UTR was preserved on
the mutant chromosome. Evidence that a PKD1 transcript
is produced from the deleted chromosome in WS-212 was
obtained by 3' rapid amplification of cDNA ends (RACE)
with a novel, smaller product generated from WS-212
cDNA. Characterisation of this product showed that
polyadenylation occurs 546bp 5' to the normal position,
within the 3'UTR of PKD1 (231bp 3' to the stop codon at
5073bp of the described PKD1 sequence 14). A
transcript with an intact open reading frame is thus
produced from the deleted WS-212 chromosome. It is
likely that a functional PKD1 protein is produced from
this transcript, explaining the lack of cystic disease
in this patient. The sequence preceding the novel
site of polyA addition is:

AGTCAGT AATTTA TATGGTGTTAAAATGTG(A)n. Although not
conforming precisely to the consensus of AATAAA, it is
likely that part of this AT rich region acts as an
alternative polyadenylation signal if, as in this case,
the normal signal is deleted (a possible sequence is
underlined).
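
The departure of this region from the AATAAA consensus can be
illustrated by scoring every hexamer in the quoted sequence
against the consensus. The Python sketch below is a minimal
illustration, not part of the described method: it assumes that
the spaces in the printed sequence are typographical and that the
30 bases are contiguous, and the helper names are hypothetical.

```python
# Illustrative scan for hexamers resembling the canonical AATAAA signal.
# SEQ is the sequence quoted in the text with its spaces removed (an
# assumption); the helper names are hypothetical.

SEQ = "AGTCAGTAATTTATATGGTGTTAAAATGTG"
CONSENSUS = "AATAAA"


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def closest_hexamers(seq, consensus=CONSENSUS):
    """Return (best distance, [(offset, hexamer), ...]) over all windows."""
    k = len(consensus)
    scored = [(hamming(seq[i:i + k], consensus), i, seq[i:i + k])
              for i in range(len(seq) - k + 1)]
    best = min(d for d, _, _ in scored)
    return best, [(i, h) for d, i, h in scored if d == best]


best, hits = closest_hexamers(SEQ)
print(CONSENSUS in SEQ)  # False: no exact consensus signal
print(best, hits)
# 2 [(4, 'AGTAAT'), (7, 'AATTTA'), (19, 'GTTAAA')]
```

On this reading no hexamer matches the consensus exactly, and the
closest candidates, AATTTA among them, each differ from AATAAA at
two positions.
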

The WS-212 deletion is 75kb between SM9-CW9
distally and the PKD1 3'UTR proximally. The WS-215
deletion is 160kb between CW15 and SM6-JH17. WS-194
has 65kb deleted between CW20 and CW10-CW36. WS-227
has a 50kb deletion between CW20 and JH11 and WS-219
has a 27kb deletion between JH1 and JH6. The distal
end of the WS-250 deletion is in CW20 but the precise
location of the proximal end is not known. However,
the same breakpoint fragment of 320kb is seen with
PvuI-digested DNA using probes on adjacent PvuI
fragments, CE18 (which normally detects a 245kb
fragment) and BLu24 (235kb). Hence this deletion can
be estimated at ~160kb. b. PFGE analysis of the deletion
in WS-219. MluI digested DNA from a normal control (N)
and WS-219 probed with the clones H2, JH1, CW21 and
CW10 which detect an ~130kb fragment in normal
individuals. CW10 also detects a much smaller fragment
from the duplicated region situated more proximally on
16p. A novel fragment of ~100kb is seen in WS-219 with
probes H2 and CW10 which flank the deletion in this
patient. JH1 is partially deleted but detects the
novel band weakly. The aberrant fragment is not
detected by CW21, which is deleted on the mutant
chromosome. BamHI digested DNA of normal control (N)
and WS-219 separated by conventional gel
electrophoresis and hybridised to probes JH1 and JH6
which flank the deletion. The same breakpoint fragment
of ~3kb is seen with both probes, consistent with a
deletion of ~27kb ending within the BamHI fragments
seen by these probes.
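
The ~160kb estimate for WS-250 follows from the fragment sizes:
a deletion spanning the boundary of two adjacent PvuI fragments
leaves one junction fragment containing what remains of both.
The Python sketch below works through that arithmetic. It is
illustrative only; the function names are assumptions, and the
sizes are the approximate PFGE values quoted above.

```python
# Illustrative estimate of a deletion from PFGE fragment sizes (kb).
# Function names are hypothetical; the sizes are those quoted in the text.


def deletion_from_junction(fragment_a_kb, fragment_b_kb, junction_kb):
    """Deletion removing the shared boundary of two adjacent fragments.

    The junction fragment keeps whatever is left of both normal
    fragments, so the missing DNA is their sum minus the junction size.
    """
    return fragment_a_kb + fragment_b_kb - junction_kb


def deletion_within_fragment(normal_kb, novel_kb):
    """Deletion lying inside one fragment: the fragment simply shrinks."""
    return normal_kb - novel_kb


# WS-250: CE18 (245 kb) and BLu24 (235 kb) both detect a 320 kb fragment.
print(deletion_from_junction(245, 235, 320))   # 160

# WS-219: the ~130 kb MluI fragment is replaced by one of ~100 kb.
print(deletion_within_fragment(130, 100))      # 30
```

The second figure is in keeping with the ~27kb deletion in WS-219,
given that the PFGE fragment sizes are themselves approximate.
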
Two further deletions 

In addition we have characterised two further
mutations of this gene which were identified in typical
PKD1 families. In both cases the mutation is a
deletion in the 75bp intron amplified by the primer
pair 3A3C (European Polycystic Kidney Disease
Consortium, 1994). The deletions are of 18bp and 20bp,
respectively, in the patients 461 and OX1054. Although
these deletions do not disrupt the highly conserved
sequences flanking the exon/intron boundaries, they do
result in aberrant splicing of the transcript. In both
cases, two abnormal mRNAs are produced, one larger and
one smaller than normal. Sequencing of these cDNAs
showed that the larger transcript includes the deleted
intron, and so has an in-frame insertion of 57bp in
461, while OX1054 has a frameshift insertion of 55bp.
The smaller transcript is due to activation of a
cryptic splice site in the exon preceding the deleted
intron and results in an in-frame deletion of 66bp in
both patients. The demonstration of two additional
mutations of this gene in PKD1 patients further
confirms that this is the PKD1 gene.
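
The reading-frame consequences stated for these two deletions
follow from the lengths involved: the retained intron contributes
75bp less the deleted bases, and the frame is preserved only when
the inserted or deleted length is a multiple of three. The Python
sketch below illustrates this under that simple assumption; the
names are hypothetical and the lengths are those given in the
text.

```python
# Illustrative reading-frame check for the two intronic deletions.
# Names are hypothetical; the bp values are those quoted in the text.

INTRON_BP = 75
DELETIONS_BP = {"461": 18, "OX1054": 20}
CRYPTIC_SPLICE_DELETION_BP = 66


def frame_effect(length_bp):
    """Classify an insertion or deletion by its effect on the frame."""
    return "in-frame" if length_bp % 3 == 0 else "frameshift"


def retained_intron_insertion(intron_bp, deleted_bp):
    """Length inserted into the mRNA when the shortened intron is retained."""
    return intron_bp - deleted_bp


for patient, deleted in DELETIONS_BP.items():
    inserted = retained_intron_insertion(INTRON_BP, deleted)
    print(patient, inserted, frame_effect(inserted))
# 461 57 in-frame
# OX1054 55 frameshift

print(frame_effect(CRYPTIC_SPLICE_DELETION_BP))  # in-frame
```
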
Characterisation of the PKD1 gene

To characterise the PKD1 gene further,
evolutionary conservation was analysed by 'zoo
blotting'. Using probes from the single copy, 3'
region (3A3) and from the duplicated area (JH4, JH8),
the PKD1 gene was found to be conserved in other
mammalian species, including horse, dog, pig and rodents
(data not shown). No evidence of related sequences was
seen in chicken, frog or Drosophila by hybridisation at
normal stringency. The degree of conservation was similar
when probes from the single copy or the duplicated
region were employed.

The full genomic extent of the PKD1 gene is not
yet known, although results obtained by hybridisation
to Northern blots show that it extends at least as
far as JH13. Several CpG islands have been localised
5' of the known extent of the PKD1 gene (Figure 6),
although there is no direct evidence that any of these
are associated with this gene.

The cDNA contig extending 5631 bp to the 3' end of
the PKD1 transcript was sequenced; where possible more
than one cDNA was analysed and in all regions both
strands were sequenced (Figure 7). We estimated that
this accounts for ~40% of the PKD1 transcript. An
open reading frame was detected which runs from the 5'
end of the region sequenced and spans 4842 bp, leaving
a 3' untranslated region of 789 bp which contains the
previously described microsatellite, KG8 (Peral, et
al., 1994; Snarey, et al., 1994). A polyadenylation
signal is present at nucleotides 5598-5603 and a polyA
tail was detected in two independent cDNAs (AH4 and
AH6) at position 5620. Comparison with the cDNAs HG-4
and 11BHS21, which are encoded by genes in the
duplicate, 16p13.1 region, shows that 1866 bp at the 5'
end of the partial PKD1 sequence shown in Figure 7 lies
within the duplicated area. The predicted amino acid
sequence from the available open reading frame extends
1614 residues, and is shown in Figure 7. A search of
the SwissProt and NBRF data bases with the available
protein sequence, using the BLAST programme (Altschul,
et al., 1990) identified only short regions of
similarity (notably, between amino-acids 690-770 and
1390-1530) to a diverse group of proteins; no highly
significant areas of homology were recognised. The
importance of the short regions of similarity is
unclear as the search for protein motifs with the
ProSite programme did not identify any recognised
functional protein domains within the PKD1 gene.
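
The lengths quoted for the partial cDNA of Figure 7 can be
cross-checked against one another. The Python sketch below is a
minimal illustration using only the figures stated above (a
5631 bp contig, a 4842 bp open reading frame and 1866 bp within
the duplicated area); the function and constant names are
assumptions introduced for the example.

```python
# Illustrative arithmetic on the partial cDNA of Figure 7.
# Names are hypothetical; the bp values are those quoted in the text.

CONTIG_BP = 5631       # sequenced cDNA contig
ORF_BP = 4842          # open reading frame from the 5' end of the contig
DUPLICATED_BP = 1866   # 5' portion lying within the duplicated area


def utr_length(contig_bp, orf_bp):
    """Bases left over 3' of the open reading frame."""
    return contig_bp - orf_bp


def codon_count(orf_bp):
    """Whole codons in an ORF; raises if the length is not a multiple of 3."""
    codons, remainder = divmod(orf_bp, 3)
    if remainder:
        raise ValueError("ORF length is not a whole number of codons")
    return codons


print(utr_length(CONTIG_BP, ORF_BP))          # 789
print(codon_count(ORF_BP))                    # 1614
print(round(DUPLICATED_BP / CONTIG_BP, 2))    # 0.33
```

The results agree with the 789 bp untranslated region and the
1614 residues stated, and show that about one third of this
partial sequence lies within the duplicated area.
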

The task of identifying and characterising the 
PKD1 gene has been more difficult than for other 
disorders because more than three quarters of the gene 
is embedded in a region of DNA that is duplicated 
elsewhere on chromosome 16. This segment of 40-50 kb 
of DNA, present as a single copy in the PKD1 area 
(16p13.3), is reiterated as several divergent copies
in the more proximal region, 16p13.1. This proximal
site contains three gene loci (HG-A, -B and -C) that
each produce polyadenylated mRNAs and share substantial
homology to the PKD1 gene; it is not known whether
these partially homologous transcripts are translated
into functional proteins.

Although gene amplification is known as a major 
mechanism for creating protein diversity during 
evolution, the discovery of a human disease locus
embedded within an area duplicated relatively recently
is a new observation. In this case, because of the
recent nature of the reiteration, the whole duplicated
genomic region retains a high level of homology, not
just the exons. The sequence of events leading to the
duplication and which sequence represents the original
gene locus are not yet clear. However, early evidence
of homology of the 3' ends of the three HG transcripts,
which are different from the 3' end of the PKD1 gene,
indicated that the loci in 16p13.1 have probably arisen
by further reiteration of sequences at this site, after
it separated from the distal locus.

To try to overcome the duplication problem we have
employed an exon linking approach using RNA isolated
from a radiation hybrid, Hy145.19, that contains just
the PKD1 part of chromosome 16, and not the duplicate
site in 16p13.1. Hence, this hybrid produces
transcripts from the PKD1 gene but not from the
homologous genes (HG-A, HG-B and HG-C). We have also
sequenced much of the genomic region containing the
PKD1 gene, from the cosmid JH2A, and have sequenced a
number of cDNAs from the HG locus. To determine the
likely position of PKD1 exons in the genomic DNA we
compared HG cDNAs (HG-4 and HG-7) to the genomic
sequence. We then designed primers with sequences
corresponding to the genomic DNA, to regions identified
by the HG exons and, employing cDNA generated from the
hybrid Hy145.19, we amplified sections of the PKD1
transcript. The polymerase Pfu was used to minimise
incorporation errors. These amplified fragments were
then cloned and sequenced. The PKD1 cDNA contig whose
sequence is shown in Figure 10 is made up of (3'-5')
the original 5.7 kb of sequence shown in Figure 7, and
the cDNAs: gap a22 (890 bp), gap gamma (872 bp), a
section of genomic DNA from the clone JH8 (2,724 bp)
which corresponds to a large exon, S1-S3 (733 bp),
S3-S4 (1,589 bp) and S4-S13 (1,372 bp). Together these
make a cDNA of 13,807 bp with the extreme 5' end of the
transcript still uncharacterised. When these cDNAs
from the PKD1 contig were sequenced an open reading
frame was found to run from the start of the contig to
the previously-identified stop codon, a region of
13,018 bp. The predicted protein encoded by the PKD1
transcript is also shown in Figure 10 and has 4,339
amino acid residues.

We have therefore compelling evidence that
mutations of the PKD1 gene give rise to the typical
phenotype of ADPKD. The location of this gene within
the PKD1 candidate region and the available genetic
evidence from the families with mutations show that
this is the PKD1 gene. The present invention therefore
includes the PKD1 gene itself and the six
PKD1-associated mutations which have been described: a de
novo translocation, which was subsequently transmitted
with the phenotype; two intragenic deletions (one a de
novo event); two further deletions; and a splicing
defect.

It has previously been argued that PKD1 could be 
recessive at the cellular level, with a second somatic 
mutation required to give rise to cystic epithelium
(Reeders, 1992). This "two hit" process is thought to
be the mutational mechanism giving rise to several
dominant diseases, such as neurofibromatosis (Legius,
et al., 1993) and tuberous sclerosis (Green, et al.,
1994), which result from a defect in the control of
cellular growth. If this were the case, however, we
might expect that a proportion of constitutional PKD1
mutations would be inactivating deletions as seen in
these other disorders.

The location of the PKD1 mutations may, however,
reflect some ascertainment bias as it is this single
copy area which has been screened most intensively for
mutations. Nevertheless, no additional deletions were
detected when a large part of the gene was screened by
FIGE, and studies by PFGE showed no large deletions of
this area in 75 PKD1 patients. It is possible that the
mutations detected so far result in the production of
an abnormal protein which causes disease through a gain
of function. However, it is also possible that these
mutations eliminate the production of functional
protein from this chromosome and result in the PKD1
phenotype by haploinsufficiency, or only after loss of
the second PKD1 homologue by somatic mutation.

At least one mutation which seems to delete the
entire PKD1 gene has been identified (WS-53) but in
this case it also disrupts the adjacent TSC2 gene and
the resulting phenotype is of TSC with severe cystic
kidney disease. Renal cysts are common in TSC so that
the phenotypic significance of deletion of the PKD1
gene in this case is difficult to assess. It is clear
that not all cases of renal cystic disease in TSC are
due to disruption of the PKD1 gene; chromosome 9 linked
TSC (TSC1) families also manifest cystic kidneys and we
have analysed many TSC2 patients with kidney cysts who
do not have deletion of the PKD1 gene.

Preliminary analysis of the PKD1 protein sequence
has highlighted two regions which provide some clues to
the possible function of the PKD1 gene. At the extreme
5' end of the characterised region are two leucine-rich
repeats (LRRs) (amino acids 29-74) flanked by
characteristic amino flanking (amino acids 6-28) and
carboxy flanking sequences (amino acids 76-133)
(Rothberg et al, 1990). LRRs are thought to be
involved in protein-protein interactions (Kobe and
Deisenhofer, 1994) and the flanking sequences are only
found in extracellular proteins. Other proteins with
LRRs flanked on the amino and carboxy sides are
receptors or are involved in adhesion or cellular
signalling. Further 3' on the protein (amino acids
350-515) is a C-type lectin domain (Curtis et al,
1992). This indicates that this region binds
carbohydrates and is also likely to be extracellular.
These two regions of homology indicate that the 5' part
of the PKD1 protein is extracellular and involved in
protein-protein interactions. It is possible that this
protein is a constituent of, or plays a role in
assembling, the extracellular matrix (ECM) and may act
as an adhesive protein in the ECM. It is also possible 
that the extracellular portion of this protein is 
important in signalling to other cells. The function 
of much of the PKD1 protein is still not fully known 
but the presence of several hydrophobic regions
indicates that the protein may be threaded through the 
cell membrane. 

Familial studies indicate that de novo mutations
probably account for only a small minority of all ADPKD
cases; a recent study detected 5 possible new mutations
in 209 families (Davies, et al., 1991). However, in
our study one of three intragenic mutations detected
was a new mutation and the PKD1 associated
translocation was also a de novo event. Furthermore,
the mutations detected in the two familial cases do not
account for a significant proportion of the local PKD1.
The OX875 deletion was only detected in 1 of 282
unrelated cases, and the splicing defect was seen in
only 1 of 48 unrelated cases. Nevertheless, studies of
linkage disequilibrium have found evidence of common
haplotypes associated with PKD1 in a proportion of some
populations (Peral, et al., 1994; Snarey, et al.,
1994) suggesting that common mutations will be
identified.
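
The counts quoted above (5 of 209 families, 1 of 282 cases, 1 of 48 cases) can be put on a common scale with a short calculation. The following is a minimal sketch only: the Wilson score interval is an illustrative choice and is not a method described in this specification, and the function and label names are hypothetical.

```python
from math import sqrt


def wilson_interval(k, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion k/n.

    Illustrative choice of interval; not taken from the specification.
    """
    p = k / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Counts as quoted in the text: (observed, total screened).
COUNTS = {
    "possible new mutations (Davies, et al., 1991)": (5, 209),
    "OX875 deletion": (1, 282),
    "splicing defect": (1, 48),
}


def summarise(counts):
    """Return {label: (fraction, lower bound, upper bound)}."""
    out = {}
    for label, (k, n) in counts.items():
        low, high = wilson_interval(k, n)
        out[label] = (k / n, low, high)
    return out


if __name__ == "__main__":
    for label, (frac, low, high) in summarise(COUNTS).items():
        print(f"{label}: {frac:.2%} (approx. 95% interval {low:.2%} to {high:.2%})")
```

With such small counts the intervals are wide, which is consistent with the statement that neither familial mutation accounts for a significant proportion of cases.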

Once a larger range of mutations has been
characterised it will be possible to evaluate whether
the type and location of mutation determines disease
severity, and if there is a correlation between
mutation and extra-renal manifestations. Previous
studies have provided some evidence that the risk of
cerebral aneurysms 'runs true' in families (Huston, et
al., 1993) and that some PKD1 families exhibit a
consistently mild phenotype (Ryynanen, et al., 1987).
A recent study has concluded that there is evidence of
anticipation in ADPKD families, especially if the
disease is transmitted through the mother (Fink, et
al., 1994). Furthermore, analysis of families with
early manifestation of ADPKD shows that there is a
significant intra-familial recurrence risk and that
childhood cases are most often transmitted maternally
(Fink, et al., 1993; Zerres, et al., 1993). This
pattern of inheritance is reminiscent of that seen in
diseases in which an expanded trinucleotide repeat was
found to be the mutational mechanism (reviewed in
Mandel, 1993). However, no evidence for an expanding
repeat correlating with PKD1 has been found in this
region although such a sequence cannot be excluded.

There is ample evidence that early presymptomatic
diagnosis of PKD1 is helpful because it allows
complications such as hypertension and urinary tract
infections to be monitored and treated quickly (Ravine,
et al., 1991). The identification of mutations within
a family will allow rapid screening of that and other
families with the same mutation. However, genetic
linkage analysis is likely to remain important for
presymptomatic diagnosis. The accuracy and ease of
linkage based diagnosis will be improved by the
identification of the PKD1 gene as a microsatellite
lies in the 3' untranslated region of this gene (KG-8)
and several CA repeats are located 5' of the gene (see
Figures 1a and 6; Peral, et al., 1994; Snarey, et al.,
1994).

Experimental Procedures
Clinical Details of Patients

Family 77

77-2 and 77-3 are 48 and 17 years old,
respectively, and have typical ADPKD. Both have
bilateral polycystic kidneys and 77-2 has impaired
renal function. Neither patient manifests any signs of
TSC (apart from cystic kidneys) on clinical and
ophthalmological examination or by CT scan of the
brain.

77-4 is 13 years old, severely mentally retarded
and has multiple signs of TSC including adenoma
sebaceum, depigmented macules and periventricular
calcification on CT scan. Renal ultrasound reveals a
small number of bilateral renal cysts.
ADPKD patients

OX875 developed ESRD from ADPKD, aged 46.
Progressive decline in renal function had been observed
over 17 years; ultrasound examinations documented
enlarging polycystic kidneys with less extensive
hepatic cystic disease. Both kidneys were removed
after renal transplantation and pathological
examination showed typical advanced cystic disease in
kidneys weighing 1920g and 3450g (normal average 120g).

OX114 developed ESRD from ADPKD aged 54; diagnosis
was made by radiological investigation during an
episode of abdominal pain aged 25. A progressive
decline in renal function and the development of
hypertension were subsequently observed. Ultrasonic
examination demonstrated enlarged kidneys with typical
cystic disease, with less severe hepatic involvement.

OX32 is a member of a large kindred affected by
typical ADPKD in which several members have developed
ESRD. The patient himself has been observed for 12
years with progressive renal failure and hypertension
following ultrasonic demonstration of polycystic
kidneys.

No signs of TSC were observed on clinical
examination of any of the ADPKD patients.
DNA Electrophoresis and Hybridisation

DNA extraction, restriction digests, electrophoresis,
Southern blotting, hybridisation and washing
were performed by standard methods or as previously
described (Harris, et al., 1990). FIGE was performed
with the Biorad FIGE Mapper using programme 5 to
separate fragments from 25-50 kb. High molecular
weight DNA for PFGE was isolated in agarose blocks and
separated on the Biorad CHEF DRII apparatus using
appropriate conditions.

Genomic DNA probes and somatic cell hybrids

Many of the DNA probes used in this study have
been described previously: MS205.2 (D16S309; Royle, et
al., 1992); GGG1 (D16S259; Germino, et al., 1990); N54
(D16S139; Himmelbauer, et al., 1991); SM6 (D16S665),
CW23, CW21, and JH1 (European Chromosome 16 Tuberous
Sclerosis Consortium, 1993). Microsatellite probes for
haplotype analysis were KG8 and W5.2 (Snarey, et al.,
1994); SM6, CW3 and CW2 (Peral, et al., 1994); 16AC2.5
(Thompson, et al., 1992); SM7 (Harris, et al., 1991);
VK5AC (Aksentijevich, et al., 1993).

New probes isolated during this study were: JH4,
JH5, JH6, 11 kb, 6 kb and 6 kb BamH I fragments,
respectively, and JH13 and JH14, 4 kb and 2.8 kb BamH
I-EcoR I fragments, respectively, all from the cosmid
JH2A; JH8 and JH10 are 4.5 kb and 2 kb Sac I fragments,
respectively and JH12 a 0.6 kb Sac I-BamH I fragment, all
from JH4; 8S1 and 8S3 are 2.4 kb and 0.6 kb Sac II
fragments, respectively, from JH8; CW10 is a 0.5 kb Not
I-Mlu I fragment of SM25A; JH17 is a 2 kb EcoR I
fragment of NM17.

The somatic cell hybrids N-OH1 (Germino, et al.,
1990), P-MWH2A (European Chromosome 16 Tuberous
Sclerosis Consortium, 1993) and Hy145.19 (Himmelbauer,
et al., 1991) have previously been described. Somatic
cell hybrids containing the paternally derived (BP2-10)
and maternally derived (BP2-9) chromosomes from OX114
were produced by the method of Deisseroth and Hendrick
(1979).

Constructing a cosmid contig

Cosmids were isolated from chromosome 16 specific
and total genomic libraries, and a contig was
constructed using the methods and libraries previously
described (European Chromosome 16 Tuberous Sclerosis
Consortium, 1993). To ensure that cosmids were derived
from the 16p13.3 region (not the duplicate 16p13.1
area) initially, probes from the single copy area were
used to screen libraries (e.g. CW21 and N54). Two
cosmids mapped entirely within the area duplicated,
CW10III and JC10.2B. To establish that these were from
the PKD1 area, they were restriction mapped and
hybridised with the probe CW10. The fragment sizes
detected were compared to results obtained with hybrids
containing only the 16p13.3 area (Hy145.19) or only the
16p13.1 region (P-MWH2A).
FISH

FISH was performed essentially as previously
described (Buckle and Rack, 1993). The hybridisation
mixture contained 100 ng of biotin-11-dUTP labelled
cosmid DNA and 2.5 mg human Cot-1 DNA (BRL), which was
denatured and annealed at 37°C for 15 min prior to
hybridisation at 42°C overnight. After stringent
washes the site of hybridisation was detected with
successive layers of fluorescein-conjugated avidin (5
mg/ml) and biotinylated anti-avidin (5 mg/ml) (Vector
Laboratories). Slides were mounted in Vectashield
(Vector Laboratories) containing 1 mg/ml propidium
iodide and 1 mg/ml 4',6-diamidino-2-phenylindole
(DAPI), to allow concurrent G-banded analysis under UV
light. Results were analysed and images captured using
a Bio-Rad MRC 600 confocal laser scanning microscope.
cDNA screening and characterisation

Foetal brain cDNA libraries in λ phage (Clonetech
and Stratagene) were screened by standard methods with
genomic fragments in the single copy area (equivalent
to CW23 and CW21) or with a 0.8 kb Pvu II-EcoR I single
copy fragment of AH3. Six PBP cDNAs were characterised
including two previously described, AH4 (1.7 kb) and 3A3
(2.0 kb) (European Chromosome 16 Tuberous Sclerosis
Consortium, 1993), and four novel cDNAs AH3 (2.2 kb),
AH6 (2.0 kb), A1C (2.2 kb) and B1E (2.9 kb). A
Striatum library (Stratagene) was screened with JH4 and
a HG-C cDNA, 11BHS21 (3.8 kb) was isolated; 21P.9 is
a 0.9 kb Pvu II-EcoR I subclone of this cDNA. A HG-A
or HG-B cDNA, HG-4 (7 kb) was also isolated by
screening the foetal brain library (Stratagene) with
JH8. HG-4/1.1 is a 1.1 kb Pvu II-EcoR I fragment from
the 3' end of HG-4. 1A1H.6 is a 0.6 kb Hind III-EcoR I
subclone of a TSC2 cDNA, 1A-1 (1.7 kb), which was
isolated from the Clonetech library. Each cDNA was
subcloned into Bluescript and sequenced utilising a
combination of sequential truncation and
oligonucleotide primers using DyeDeoxy Terminators
(Applied Biosystems) and an ABI 373A DNA Sequencer
(Applied Biosystems) or by hand with 'Sequenase' T7 DNA
polymerase (USB).
RNA Procedures

Total RNA was isolated from cell lines and tissues
by the method of Chomczynski and Sacchi (1987) and
enrichment for mRNA made using the PolyATtract mRNA
Isolation System (Promega). For RNA electrophoresis
0.5% agarose denaturing formaldehyde gels were used
which were Northern blotted, hybridised and washed by
standard procedures. The 0.24 - 9.5 kb RNA (Gibco BRL)
size standard was used and hybridisation of the probe
(1-9B3) to the 13 kb Utrophin transcript (Love, et al.,
1989) in total fibroblast RNA was used as a size marker
for the large transcripts.

RT-PCR was performed with 2.5 mg of total RNA by
the method of Brown et al. (1990) with random hexamer
primers, except that AMV-reverse transcriptase (Life
Sciences) was employed. To characterise the deletion
of the PBP transcript in OX114 we used the primers:

AH3 F9 5' TTT GAC AAG CAC ATC TGG CTC TC 3'

AH3 B7 5' TAC ACC AGG AGG CTC CGC AG 3'

in a DMSO containing PCR buffer (Dode, et al., 1990)
with 0.5 mM MgCl2 and 36 cycles of: 94°C, 1 min; 61°C,
1 min; 72°C, 2 min plus a final extension of 10 min.
The 3A3 C primers used to amplify the OX32 cDNA and DNA
were:

3A3 C1 5' CGC CGC TTC ACT AGC TTC GAC 3'

3A3 C2 5' ACG CTC CAG AGG GAG TCC AC 3'

These were employed in a PCR buffer and cycle
previously described (Harris, et al., 1991) with 1 mM
MgCl2 and an annealing temperature of 61°C.

PCR products for sequencing were amplified with
Pfu-l (Stratagene) and ligated into the Srf-1 site in
PCR-Script (Stratagene) in the presence of Srf-1.
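
As an illustration of how the primer sequences listed above might be checked before use, the following minimal sketch computes length, GC content and a rough melting temperature for each. The Wallace rule (2°C per A/T, 4°C per G/C) is an assumption introduced here as a rule of thumb for short oligonucleotides; it is not the basis on which the annealing temperatures in this specification were chosen, and all function names are hypothetical.

```python
# Primer sequences as listed in the text (5' to 3', spaces removed).
PRIMERS = {
    "AH3 F9": "TTTGACAAGCACATCTGGCTCTC",
    "AH3 B7": "TACACCAGGAGGCTCCGCAG",
    "3A3 C1": "CGCCGCTTCACTAGCTTCGAC",
    "3A3 C2": "ACGCTCCAGAGGGAGTCCAC",
}

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    gc = sum(base in "GC" for base in seq)
    return 2 * (len(seq) - gc) + 4 * gc


def reverse_complement(seq):
    """Reverse complement, e.g. to locate a reverse primer on the top strand."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"GC={gc_fraction(seq):.0%}", f"Tm~{wallace_tm(seq)}C")
```

The estimate is crude for primers longer than about 20 bases, so it should be read only as a consistency check between the two primers of a pair.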
RNAse protection

Tissues from normal and end-stage polycystic
kidneys were immediately homogenised in guanidinium
thiocyanate. RNA was purified on a cesium chloride
gradient and 30 mg total RNA was assayed by RNAse
protection by the method of Melton, et al. (1984)
using a genomic template generated with the 3A3 C
primers.
Heteroduplex Analysis

Heteroduplex analysis was performed essentially as
described by Keen et al. (1991). Samples were amplified
from genomic DNA with the 3A3 C primers, heated at
95°C for 5 minutes and incubated at room temperature
for at least 30 minutes before loading on a Hydrolink
gel (AT Biochem). Hydrolink gels were run for 12-18
hours at 250V and fragments observed after staining
with ethidium bromide.

Extraction and amplification of paraffin-embedded DNA

DNA from formalin fixed, paraffin wax embedded
kidney tissue was prepared by the method of Wright and
Manos (1990), except that after proteinase K digestion
overnight at 55°C, the DNA was extracted with phenol
plus chloroform before ethanol precipitation.
Approximately 50 ng of DNA was used for PCR with 1.5 mM
MgCl2 and 40 cycles of 94°C for 1 min, 59°C for 1 min
and 72°C for 40 s, plus a 10 min extension at 72°C.
The oligonucleotide primers designed to amplify across
the genomic deletion of OX875 were:

AH4F2 : 5' - GGG CAA GGG AGG ATG ACA AG - 3'

JH14B3 : 5' - GGG TTT ATC AGC AGC AAG CGG - 3'

which produced a product of ~ 220 bp in individuals
with the OX875 deletion.
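
The cycling conditions given in the text can be written down as data so that the programmed hold time of each run is easy to compare. The following is a minimal sketch under stated assumptions: the class and function names are hypothetical, and instrument ramp times and any initial denaturation step are ignored because the text does not give them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int   # block temperature in degrees C
    seconds: int  # hold time in seconds


def program_seconds(cycle, n_cycles, final_extension_s=0):
    """Total programmed hold time in seconds, excluding ramping between steps."""
    return n_cycles * sum(step.seconds for step in cycle) + final_extension_s


# Paraffin-embedded DNA: 40 cycles of 94C 1 min, 59C 1 min, 72C 40 s, plus 10 min at 72C.
PARAFFIN_CYCLE = [Step(94, 60), Step(59, 60), Step(72, 40)]

# OX114 RT-PCR: 36 cycles of 94C 1 min, 61C 1 min, 72C 2 min, plus a final 10 min.
OX114_CYCLE = [Step(94, 60), Step(61, 60), Step(72, 120)]

if __name__ == "__main__":
    print(program_seconds(PARAFFIN_CYCLE, 40, 600) / 60, "min (paraffin DNA)")
    print(program_seconds(OX114_CYCLE, 36, 600) / 60, "min (OX114 RT-PCR)")
```

The shorter 40 s extension in the paraffin protocol fits the short (~220 bp) product expected across the OX875 deletion.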
3' RACE analysis of WS-212

3' RACE was completed essentially as described
(European Polycystic Kidney Disease Consortium (1994)).
Reverse transcription was performed with 5pg total RNA
with 0.5pg of the hybrid dT17 adapter primer using
conditions previously described (Frohman et al.,
(1988)). A specific 3' RACE product was amplified with
the primer F5 and adapter primer in 0.5mM MgCl2 with
the program: 57°C, 60s; 72°C, 15 minutes and 30 cycles
of 95°C, 40s; 57°C, 60s; 72°C, 60s plus 72°C, 10
minutes. The amplified product was cloned using the TA
cloning system (Invitrogen) and sequenced by
conventional methods.
CLAIMS

1. An isolated, purified or recombinant nucleic acid
sequence comprising

(a) a PKD1 gene or its complementary strand,

(b) a sequence substantially homologous to, or
capable of hybridising to, a substantial portion of a
molecule defined in (a) above,

(c) a fragment of a molecule defined in (a) or (b)
above.

2. A sequence according to claim 1, wherein the PKD1
gene has the partial nucleic acid sequence according to
Figure 7 and/or 10.

3. A sequence according to claim 1 or claim 2 comprising
a DNA molecule selected from:

(a) a PKD1 gene or its complementary strand,

(b) a sequence substantially homologous to, or
capable of hybridising to, a substantial portion of a
molecule defined in (a) above,

(c) a molecule coding for a polypeptide having the
partial sequence of Figure 7,

(d) genomic DNA corresponding to a molecule in (a)
above; and

(e) a fragment of a molecule defined in any of (a),
(b), (c) or (d) above.

4. A nucleic acid sequence comprising a mutant PKD1
gene, selected from those wherein:-

(a) [OX114] base pairs 1746-2192 as defined in
Figure 7 are deleted (446bp);

(b) [OX32] base pairs 3696-3831 as defined in
Figure 7 are deleted by a splicing defect;

(c) [OX875] about 5.5kb flanked by the two XbaI
sites shown in Figure 3a are deleted and the EcoRI site
separating the CW10 (41kb) and JH1 (18kb) sites is thereby
absent; and

(d) [WS53] about 100kb extending between the JH1
and CW21 and the SM6 and JH17 sites shown in Figure 6 and
the PKD1 gene is thereby absent.

5. A nucleic acid sequence comprising a mutant PKD1 gene
selected from those wherein:-

(a) [461] about 18bp are deleted in the 75bp
intron amplified by the primer pair 3A3C insert at position
3696 of the 3' sequence as shown in Figure 11;

(b) [OX1054] about 20bp are deleted in the 75bp
intron amplified by the primer pair 3A3C insert at position
3696 of the 3' sequence as shown in Figure 11;

(c) [WS212] about 75kb are deleted between SM9-CW9
distally and the PKD1 3' UTR proximally as shown in Figure
12;

(d) [WS-215] about 160kb are deleted between CW20
and CW10-CW36 as shown in Figure 12;

(e) [WS-227] about 50kb are deleted between CW20
and JH11 as shown in Figure 12;

(f) [WS-219] about 27kb are deleted between JH1 and
JH6 as shown in Figure 12;

(g) [WS-250] about 160kb are deleted between CW20
and BLu24 as shown in Figure 12; and

(h) [WS194] a deletion of about 65kb between CW20 and
CW10.

6. An RNA molecule comprising an RNA sequence
corresponding to a DNA sequence according to any of claims
1 to 5.

7. An RNA molecule according to claim 6, wherein the
molecule is the transcript referenced PKD1 and identifiable
from the restriction map of Figure 3a and having a sequence
of about 14 kb.

8. A nucleic acid probe having a sequence according to
any of the preceding claims and optionally including a
label.

9. A nucleic acid sequence according to any preceding
claim, wherein the nucleic acid sequence encoding PKD1 is
operably linked to transcriptional and/or translational
expression signals.

10. An isolated, purified or recombinant polypeptide
comprising a PKD1 protein or a mutant or variant thereof or
encoded by a sequence according to any of claims 1 to 9 or
a variant thereof having substantially the same activity as
the PKD1 protein.

11. A polypeptide according to claim 10, wherein the
PKD1 protein has the amino acid sequence according to the
partial amino acid sequence of Figure 7 and/or Figure 10.

12. An anti-PKD1 antibody or a labelled anti-PKD1
antibody.

13. A method for screening a subject to determine
whether said subject is a PKD1-associated disorder carrier
or a patient having a PKD1-associated disorder, which method
comprises detecting the presence of and/or evaluating the
characteristics of PKD1 DNA, PKD1 RNA and/or PKD1
polypeptide in a biological sample from said patient.

14. A method according to claim 13 which is or includes
detecting and/or evaluating whether the PKD1 DNA is mutated,
deleted, aberrant or otherwise abnormal, or is not
expressing normal PKD1 protein.

15. A method according to claim 13 or claim 14, wherein
the detection and/or evaluation includes the step of
comparing the results thereof with results obtained using a
mutated sequence according to claim 4 or claim 5.

16. A method according to any of claims 13 to 15,
wherein said screening includes applying a nucleic acid
amplification process to said sample to amplify a fragment
of the PKD1 DNA or cDNA corresponding to the PKD1 RNA.

17. A method according to claim 16, wherein said nucleic
acid amplification process uses at least one of the
following sets of primers as identified herein:-
AH3 F9 : AH3 B7
3A3 C1 : 3A3 C2
AH4 F2 : JH14 B3

18. A method according to any of claims 13 to 17 which
comprises digesting said sample to EcoRI fragments and
hybridising with a DNA probe which hybridises to the EcoRI
fragment identified (A) in Figure 3(a).

19. A method according to claim 18, wherein said DNA
probe comprises the DNA probe CW10 identified herein.

20. A method according to any of claims 13 to 17 which
comprises digesting said sample to provide BamHI fragments
and hybridising with a DNA probe which hybridises to the BamHI
fragment identified (B) in Figure 3(a).

21. A method according to claim 20, wherein said DNA
probe comprises the DNA probe 1A1H.6 identified herein.

22. A vector (such as Bluescript (available from
Stratagene)) comprising the nucleic acid sequence of any of
claims 1 to 9.

23. A host cell (such as E. coli strain SL-1 Blue
(available from Stratagene)) transfected or transformed with
a vector according to claim 22.

24. The use of a vector according to claim 23 or a
nucleic acid sequence according to any of claims 1 to 11 in
gene therapy and/or in the preparation of an agent for
treating or preventing a PKD1-associated disorder.

25. A method of treating or preventing a PKD1-associated
disorder which method comprises administering to
a patient in need thereof a functional PKD1 gene to affected
cells in a manner that permits expression of PKD1 protein
therein and/or a transcript produced from a mutated
chromosome such as the deleted WS-212 chromosome which is
capable of expressing functional PKD1 protein therein.

26. A diagnostic kit for carrying out a method according
to any of claims 13 to 21, comprising nucleic acid primers
for amplifying a fragment of a sequence according to any of
claims 1 to 9.

27. A diagnostic kit according to claim 26, wherein the
nucleic acid primers comprise at least one of the following
sets:

AH3 F9 : AH3 B7
3A3 C1 : 3A3 C2
AH4 F2 : JH14 B3

28. A diagnostic kit for carrying out a method according
to claim 18, including one or more substances for digesting
a sample to provide EcoRI fragments and a DNA probe as
defined in claim 19.

29. A diagnostic kit for carrying out a method according
to claim 20, including one or more substances for digesting
a sample to provide BamHI fragments and a DNA probe as
defined in claim 21.

30. A diagnostic kit for carrying out a method for
determining whether said subject is a PKD1-associated
disorder carrier or a patient having a PKD1-associated
disorder, which includes a nucleic acid probe capable of
hybridising to a sequence according to any of claims 1 to
11.
1 LNEEPLTLAGEEIVAQGKRS 20

61 GAOOOGOGGftGOCTX3CIGTQCTATGGCX3GOGOOOCAG 120 

21 DPRSLLCYGGAPGPGCHFSI 40 

41 p DC E AG A CI F Tt ^^ 50° 

181 GTGGACTCXAATOOCTITOa^nTGGCTATA 240 

61 VDSNPFPFGYISNYTVSTKV 80 

241 GOCTOGATGGCATIXXA^^ 3QQ 

81 ASMAFQTQAGAQIPIERLAS 100 

301 GAGD3GGCCATCACOGTTGAAGGTOCXX^ACAAC^ 360 

101 ERAITVKV PNNSDWAARGHR 120 

361 AQCIXDaG0CAACia3goc ^. im ^^ 

121 SSANSANSVVVQPQASVGAV 140 



480 

T L 160 



421 GTCACmtGGACAGQO^AAQOCTG^ 

141 VTLDSSNPAAGLHLQLNY 

481 CTGGACGGCCACTAOTrGTCrcAGGAACXn^ 540 

161 LDGHYLSEEPEPYLAVYLHS 180 

541 GAGOXCXX300CAATGAGCAO\ACIGCiaGG^ aqq 

181 EPRPNEHNCSASRRIRPESL 200 

601 (XGGGFGCK2>CCAOOGG^ 660 

201 QGADHRPYTFFISPGSRDPA 220 

661 Q2GACTTACCATCTGAAOCTOT 720 

221 GSYHLNLSSHFR. WSALQVSW 240 

721 GSCCTCTACACXJrcrXTnjIGCCACT 780 

241 GLYTSLCQYFSEEDMVWRTE 250 

781 GGGCTOCTGOOOCTOGAGGAGACXnOXXDQCX 840 

261 GLLPLEETSPRQAVCLTRH 



L 230 



900 

P E 300 



841 ACXDGOCTimXXXX^GCCICITaG^ 

281 TAFGAS LFVPPSHVRFVF 

901 COSACAGOGGATGTAAACTACATOGTCATGCrrGAC^ q 6 o 

301 PTADVNY1VMLTCAVCLVTY 320 

961 ATGGTCATGG00G0CATCXinX3CACAAGCTOG^^ 1020 

321 MVMAAILHKLDQLDASRGRA 340 

1021 ATCXXnTTCTGTGGQCAGaSQQSOOGCITCAA^ 1080 

341 IPFCGQRGRFKYEILVKTGW 360 

1081 GGCXxsoc^crcAQGrr^^ 1140 

361 GRGSGTTA-HVGIMLYGVDSR 380 

H41 AGCX3QCXACXDGCACCTGGAOGGCX5ACAGAG^ 120 0 

381 SGHRHLDGDRAFHRNSLDIF 400 

1201 OGGATOGOCAOCCaQC^CAGOCTOQGTAGO^ 1260 

401RIATPHSLGSVWKIRVWHDN 420 

Figure 7 
12 421 K^G 00 ^ V I^^R^ 440°

1321 OGCAGOQCXTTT^ 1380 

441 RS AFFLVNDWLSVETEANGG 460 

1381 CTGGTGGAG^GGAGGTGCT^^ 140 

461 LVEKEVLAASDAALLRFRRL 480 

1441 CTGGTGGCTGAGCSTX^^ 1500 

481 L VAE LQRGFFDKHIWLS 1WD 500 

1501 CGGCXXSOCra^AQCr^ 1560 

501 RPPRSRFTR1QRATCCVLL1 520 

1561 TOOCICT^^ 1620 

1621 AOGGGGCATGTGTCX^GGCTGA^ 1680 

541 TGHVSRLSP LSVDTVAVGLV 560 

1681 TCX2AGCXnX3C7ITGTCTATO 1740 

561 SSVVVYPVYLAILFLFRMSR 580 

1741 AGCAAGGTG3CTGGGAGCXXXSAG0CO 1800 

581 SKVAGSPSPTPAGQQVLDID 600 

1801 ptxttccrGGPiOra^ i860 

601 SCLDSSVLDSSFLTFSGLHA 620 

1861 GAGGOCTITCTIGGACAGAT^ 1920 

621 EAFVGQMKSDLFLDDSKSLV 640 

1921 TGCTGGCCX^COSGCGAG 1980 

641 CWPSGEGTLSWPDLLSDPSI 660 

1981 GTX3GGTAQCAATCTC^^ 2040 

661 VGSNLRQLARGQAGHGLGPE 680 

2041 G^GGACGGCTTCTOOCnGGCCAGa 2100 

681 EDGFSLAS PYSPAKS FSASD 700 

2101 GAAGACCTGATOOAGGAGGTC^ 2160 

701 EDLIQQVLAEGVSSPAPTQD 720 

2161 ACCCACATXX^P^ 2220 

721 THMETDLLSSLSSTPGEKTE 740 

2221 ACX3CTI\3GCjGCIX^ 2280 

741 TLALORLGE LGP PSPGLNWE 760 

2281 CAGCCCCAGGC^GCX^GGCTGTOCAGGA^ 2340 

761 QPQAARLSRTGLVEGLRKRL 780 

2341 CTGCXXSQXTQGTGTIGCX^^ 2400 

781 LPAWCASLAHGLSLLLVAVA 800 

2401 GTOGCTGTCTCAGGCTGGGTa ' 2460 

801 VAVSGWVGASFPPGVSVAWL 820 

2461 CTGTOCAQCAGOXCAQC^ 2520 

821 LSSSASFLASFLGWEPLKVL 840 

Figure 7 cont'd 
2521 CTQ3AAQOCXnX^ACT 2580

841 LEALYFSL VAKRLHPDEDDT 860 

2581 CTGCTAGAGAGOOOQGCJEX^^ 2640 

861 LVESPAVT.PVSARVPRVRPP 880 

2641 CAOGGCTTTGCACTCnT^ 2700 

881 HGFALF LAKEEARKVKRLHG 900 

2701 ATGC7TCCX3GAGOCriXXTQ 2760 

901 MLRSLLVYMLFLLVTLLASY 920 

2761 GQQGATGOC^CATQCX^^ 2820 

921 GDASCHGHAYRLQSAIKQEL 940 

941 H ^R 9 ^^ 960° 

2 961 v K3 L IX ?^^ 980° 

2941 CXXX^GCTTOOQGCT^^ 3000 

981 RQVRLQEALYPDPPGPRVHT 1000 

3001 TGCTOGGOOGCAGGAGGCTITC^ 3060 

1001 CSAAGGFSTSDYDVGWESPH 1020 

3061 AATOGCTCXXSCXSAO?^ 3120 

1021 NGSGTWAYSAPDLLGAWS VJG 1040 

3121 TXDCTGIGOOGTGrTATGACAGOGGGGQC^ 3180 

1041 SCAVYDSGGYVQELGLS LEE 1060 

3181 AGOOGOGAOOGGCIXXXXTITOCTGC^^ 3240 

1061 SRDRLRFLQLHNWLDNRSRA 1080 

3241 GTGTITCCTGGAGCTCAO^ 3300 

1081 VFLELTRYSPAVGLHAAVTL 1100 

3301 CKXTOGAGTITC 3360 

1101 RLE FPAAGRALAALSVRPFA 1120 

3361 CTGOGCOSCX^CAGOC^^ 3420 

1121 LRRLSAGLSLPLLTSVCLLL 1140 

3421 TTOXXDGIXXACT^ 3480 

1141 FAVHFAVAEARTWHREGRWR 1160 

3481 GIGCIGOGGCIXDGGAGOCTGGGOGOGC^^ 3540 

1161 VLRLGAWARWLLVALTAATA 1180 

3541 GTGGTAOGCXJTOQOC^ 3600 

1181 LVR LA QLGAADRQWTRFVRG 1200 

3601 CGCXXXXX3CXXXn^^ 3660 

1201 RPRRFTSFDQVAHVSSAARG 1220 

3661 CTGGOGGCCTX^XTGCTCnTC^ 3720 

1221 LAASLLFLLLVKAAQHVRFV 1240 

3721 CXXX^CTX^GTCXX^ 3780 

1241 RQWSVFGKTLCRALPE LLGV 1260 

Figure 7 cont'd 
3781 AOCTTGGQCETC 3840

1261 TLGLVVLGVAYAQLAI LLVS 1280 

3841 TOCTX?TCriX3^^ 3900 

1281 SCVDSLWSVAQALLVLCPGT 1300 

1301 G* 3 !?^^ 1320 

3961 CTXZTI\3QGCACTGCX3QCro 4020 

1321 LWALRLWGALRLGAVI L R W R 1340 

4021 TACCADG0CTI\30GTGGAGAGCTGTA^ 4080 

1341 YHALRGELYRPAWEPQDYEM 1360 

4081 GTCGAGTIXTTTOCrc^ 4140 

1361 VELFLRRLRLWMGLSKVKEF 1380 

4141 OGOCACAAACTXXX3CJITTG^GGGA 4200 

1381 RHKVRFEGMEPLPSRSSRGS 1400 

4201 AAGCTATQX03GATGrroOCOOC^ 4260 

1401 KVSPDVPPPSAGSDASHPST 1420 

4261 TCCTCX^GCX^GC^X^ 4320 

1421 SSSQLDGLSVSLGRLGTRCE 1440 

4321 CCTGAGCCCTCCCOJO^^ 4380 

'' 1441 PEPSRLQAVFEALLTQFDRL 1460 

4381 PJ*CCAG3CCPCfi^^ 4440 

1461 NQATEDVYQLEQQLHSLQGR 1480 

4441 AGGAGC^CXXDGGGCXXrOG^ 4500 

1481 RSSRAPAGSSRG PSPGLRPA 1500 

4501 OIGCCCAOCXX3CCTJ^^^ 4560 

1501 LPSRLARASRGVDLATGPSR 1520 

1521 ^P C ' I S 0 ^^ 1540 

4621 GGTGGGOCXnGGAGTOGGAGTC^^ 4689 

0 

1541 GGPWSRSGHRSVLLSAA VKA 1560 

4681 GAGGGOCAGGCAGAATGGCTGCAOGrrAGCT 4740 

1561 EGQAEWLHVGSPESRQGHLS 1580 

1581 ^"^C^^ F K 1600 

4801 CXX^QCTCmiTGQGAAa^ 4860 

1601 PSSLGKDTAVLDGF 1620 

4861 TTTATTroaXX^GTO^TC^ 4920 

4921 GTCCOXACTGCTAAGG^ 4980 

4981 OOOOTAAGJVATTPCCTCtt 5040 

5041 . TCGTCTCAGTAATITO - 5100 

Figure 7 cont'd
5101 TAGQGCTGAGGGGOCTXXO^ 5160

5161 TCJK3GrTCGCX?ITATOGCAGi^^ 5220 

5221 CTGGGGGCACAGCTCTC^ 5280 

5281 TGOOCX^V3GCCAGC7rAGCAAGAG 5340 

5341 CTAGCAGGACTAGGCATGTCT^^ 5400 

5401 GGGCHGGCTUXAGGGIGGAG^ 5460 

5461 GCGACIX?TOCTGTATGGaC^^ 5520 

5521 TGTCTAOC^CITC^^ 5580 

5581 AOCAAGCAGACAAACTCAAT^^ 5631 

1A1H0.6 

1 AAGCTTOQCA OCATCAAGG3 CCAGTICAAC TITCICCAOG TCATOGPCAC COOGCTGGAC 

61 TAQGAGTGCA AOCTGGTOIC OGTOCAGTGC AGGAAAGACA TGGAG3Q0CT TGIGGACACC 

121 AGGGTOG0CA ASKLOBIG1C TGftOOGCAAC CTO00CTT0G TGG0O0GOCA GATGGGOCTG 

181 CAOGCAAATA TOGGGICACA GGK3CATCAT AGOOGGTOCA AOGOCAOOGA TATCTAOCEC 

241 TOCAAGTOGA TTGOOOGGTT QDQGCACATC AAGOGQCTOC GOCAGGGGAT CIGOGAGGAA 

301 GOGGGCTACT (XAAGOOCAG OCTAQCTCTG GTOCAOGCTC OGIGOCATAG CAAAG3GOGT 

361 GCACAGACIC CAGOOGAGGC CACAOCTGGC TATGAGGIX3G GGCAGGGGAA GOGOGICATC 

421 TOCTOGGTTGG AGGACITCAC OGAGITIGIG TGAGGG09GG GOOGIXXCTC CIGCACIGGC 

481 CTTGGAOGGTr MTOGCTGTC ACTGAAATAA ATAAAGTOCT GAOOOCAGIG CACAGACATA 

541 GAGGCACAGA TIGC 

Figure 8 
CW10F

1 GICOGOQCTC Q2AOCTACQG TICT QGPGIG TGIGAGAOGT GGGGGGCTGG GAAGIGTTG3 

61 CAGAOGGOGA GTAOGIOCTC AL'IULTJLTIG T1CT1T1GAC CTAAGGIGGC GAGIGGCACT 

121 GCTGAGITOC GCTCAGIGOC OGGOGTGATG TGOGAOG00C GTOC ATlUl ' l ' GCIUTTAGGr 

181 GGIGGOGGIG TG 

CW10R 

1 AGGCAGGICT OOOGCAGGAG CAGGGGAGAG GCAOOCAAGG T 
Figure 9 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: (Compare Fig.l)

C GGC GCC GOC TGC OGC CTC AAC TGC TOG GGC OGC GOG CTG OGG ACG 46 
Gly Ala Ala Cys Arg Val Asn Cys Ser Gly Arg Gly Leu Arg Thr 
1 5 10 15

CTC GGT OOC GOG CTG OGC ATC CCC GOG GAC GCC ACA GOG CTA GAC GTC 94 
Leu Gly Pro Ala Leu Arg Ile Pro Ala Asp Ala Thr Ala Leu Asp Val
20 25 30 

TCC CAC AAC CTG CTC OGG GOG CTG GAC GTT GGG CTC CTG GOG AAC CTC 142 
Ser His Asn Leu Leu Arg Ala Leu Asp Val Gly Leu Leu Ala Asn Leu 
35 40 45 

TOG GOG CTG GCA GAG CTG GAT ATA AGC AAC AAC AAG ATT TCT AOG TTA 190 
Ser Ala Leu Ala Glu Leu Asp Ile Ser Asn Asn Lys Ile Ser Thr Leu
50 55 60

GAA GAA GGA ATA TTT GCT AAT TTA TTT AAT TTA ACT GAA ATA AAC CTG 238 
Glu Glu Gly Ile Phe Ala Asn Leu Phe Asn Leu Ser Glu Ile Asn Leu
65 70 75 

ACT GGG AAC COG TTT GAG TGT GAC TGT GGC CTG GOG TOG CTG O0G OGA 286 
Ser Gly Asn Pro Phe Glu Cys Asp Cys Gly Leu Ala Trp Leu Pro Arg 
80 85 90 95 

TGG GOG GAG GAG CAG CAG GTG OGG CTG GTG CAG OOC GAG GCA GOC AOG 334 
Trp Ala Glu Glu Gln Gln Val Arg Val Val Gln Pro Glu Ala Ala Thr
100 105 110 

TGT GCT GGG OCT GGC TOC CTG GOT GGC CAG OCT CTG CTT GGC ATC OOC 382
Cys Ala Gly Pro Gly Ser Leu Ala Gly Gln Pro Leu Leu Gly Ile Pro
115 120 125 

TTG CTG GAC ACT GGC TCT GCT GAG GAG TAT GTC GOC TGC CTC OCT GAC 430 
Leu Leu Asp Ser Gly Cys Gly Glu Glu Tyr Val Ala Cys Leu Pro Asp 
130 135 140

AAC AGC TCA GGC AOC CTG GCA GCA GTG TOC TTT TCA GCT GOC CAC GAA 478
Asn Ser Ser Gly Thr Val Ala Ala Val Ser Phe Ser Ala Ala His Glu 
145 150 155 

GGC CTG CTT CAG CCA GAG GOC TGC AGC GOC TTC TGC TIC TOC AOC GGC 526 
Gly Leu Leu Gln Pro Glu Ala Cys Ser Ala Phe Cys Phe Ser Thr Gly
160 165 170 175 

CAG GGC CTC GCA GCC CTC TOG GAG CAG GGC TGG TGC CTG TCT GGG GCG 574 
Gln Gly Leu Ala Ala Leu Ser Glu Gln Gly Trp Cys Leu Cys Gly Ala
180 185 190

GOC CAG CCC TOC ACT GOC TOC TIT GOC TGC CTG TOC CTC TGC TOC GGC 622 
Ala Gln Pro Ser Ser Ala Ser Phe Ala Cys Leu Ser Leu Cys Ser Gly
195 200 205 

OOC CCG OCA OCT CCT GOC CCC AOC TGT AGG GGC OOC AOC CTC CTC CAG 670 
Pro Pro Pro Pro Pro Ala Pro Thr Cys Arg Gly Pro Thr Leu Leu Gln
210 215 220 

CAC GTC TTC OCT GOC TOC OCA GGG GOC AOC CTG GTG GGG OOC CAC GGA 718 
His Val Phe Pro Ala Ser Pro Gly Ala Thr Leu Val Gly Pro His Gly 
225 230 235 
OCT CTG GOC TCT GQC CAG CTA GCA GOC TTC CAC ATC GCT GOC COG CTC 766
Pro Leu Ala Ser Gly Gln Leu Ala Ala Phe His Ile Ala Ala Pro Leu
240 245 250 255 

OCT GTC ACT GOC ACA OQC TGG GAC TTC GGA GAC GGC TCC GQC GAG GTG 814 
Pro Val Thr Ala Thr Arg Trp Asp Phe Gly Asp Gly Ser Ala Glu Val 
260 265 270 

GAT GOC GCT GGG COG GCT GOC TOG CAT CGC TAT GTG CTG OCT GGG OGC 862 
Asp Ala Ala Gly Pro Ala Ala Ser His Arg Tyr Val Leu Pro Gly Arg
275 280 285 

TAT CAC GTG AOS GOC GTG CTG GOC CTG GGG GOC GGC TCA GOC CTG CTG 910 
Tyr His Val Thr Ala Val Leu Ala Leu Gly Ala Gly Ser Ala Leu Leu 
290 295 300

GGG ACA GAC GTG CAG GTG GAA GOG GCA OCT GOC GOC CTG GAG CTC GTG 958 
Gly Thr Asp Val Gln Val Glu Ala Ala Pro Ala Ala Leu Glu Leu Val
305 310 315 

TGC COG TCC TOG GTG CAG AGT GAC GAG AGC CTT GAC CTC AGC ATC CAG 1006 
Cys Pro Ser Ser Val Gln Ser Asp Glu Ser Leu Asp Leu Ser Ile Gln
320 325 330 335 

AAC CGC GGT GGT TCA GGC CTG GAG GOC GOC TAG AGC ATC GTG GOC CTG 1054 
Asn Arg Gly Gly Ser Gly Leu Glu Ala Ala Tyr Ser Ile Val Ala Leu
340 345 350 

GGC GAG GAG COG GOC GGA GOG GTG CAC COG CTC TGC GOC TOG GAC AGS 1102 
Gly Glu Glu Pro Ala Arg Ala Val His Pro Leu Cys Pro Ser Asp Thr 
355 360 365 

GAG ATC TTC OCT GGC AAC GGG CAC TGC TAG CGC CTG GTG GTG GAG AAG 1150 
Glu Ile Phe Pro Gly Asn Gly His Cys Tyr Arg Leu Val Val Glu Lys
370 375 380 

GOG GOC TGG CTG CAG GOG CAG GAG CAG TGT CAG GOC TGG GOC GGG GOC 1198
Ala Ala Trp Leu Gln Ala Gln Glu Gln Cys Gln Ala Trp Ala Gly Ala
385 390 395 

GOC CTG GCA ATG GTG GAC AGT CCC GOC GTG CAG CGC TTC CTG GTC TCC 1246 
Ala Leu Ala Met Val Asp Ser Pro Ala Val Gln Arg Phe Leu Val Ser
400 405 410 415

CGG GTC ACC AGG AGC CTA GAC GTG TGG ATC GGC TTC TOG ACT GTG CAG 1294 
Arg Val Thr Arg Ser Leu Asp Val Trp Ile Gly Phe Ser Thr Val Gln
420 425 430 

GGG GTG GAG GTG GGC CCA GOG COG CAG GGC GAG GOC TTC AGC CTG GAG 1342 
Gly Val Glu Val Gly Pro Ala Pro Gln Gly Glu Ala Phe Ser Leu Glu
435 440 445 

AGC TGC CAG AAC TOG CTG CCC GGG GAG CCA CAC CCA GCC ACA GOC GAG 1390
Ser Cys Gln Asn Trp Leu Pro Gly Glu Pro His Pro Ala Thr Ala Glu
450 455 460

CAC TGC GTC CGG CTC GGG CCC ACC GGG TGG TGT AAC ACC GAC CTG TGC 1438
His Cys Val Arg Leu Gly Pro Thr Gly Trp Cys Asn Thr Asp Leu Cys
465 470 475

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TCA GOG (DOG CAC AGO TAG GTC TGC GAG CTG CAG OOC GGA GGC OCA GTG I486 
Ser Ala Pro His Ser Tyr Val Cys Glu Leu Gin Pro Gly Gly Pro Val 
480 485 490 ~ * 495 

CAG GAT GGC GAG AAC CTC CTC GTG GGA GOG OOC ACT GGG GAC CTG CAG 1534 
Gln Asp Ala Glu Asn Leu Leu Val Gly Ala Pro Ser Gly Asp Leu Gln
500 505 510 

GGA OOC CTG ACG OCT CTG GCA CAG CAG GAC GGC CTC TCA GOC COG CAC 1582 
Gly Pro Leu Thr Pro Leu Ala Gln Gln Asp Gly Leu Ser Ala Pro His
515 520 525



GAG OOC GTG GAG GTC ATG GTA TIC COG GGC CTG OCT CTG AGC OCT GAA 1630 
Glu Pro Val Glu Val Met Val Phe Pro Gly Leu Arg Leu Ser Arg Glu
530 535 540 

GOC TTC CTC AOC AOG GOC GAA TIT GGG AOC CAG GAG CTC OGG CGG OOC 1678 
Ala Phe Leu Thr Thr Ala Glu Phe Gly Thr Gln Glu Leu Arg Arg Pro
545 550 555 

GOC CAG CTG OGG CTG CAG GTG TAC OGG CTC CTC AGC ACA GCA GGG AOC 1726 
Ala Gln Leu Arg Leu Gln Val Tyr Arg Leu Leu Ser Thr Ala Gly Thr
560 565 570 575 

COG GAG AAC GGC AGC GAG OCT GAG AGC AGG TOC COG GAC AAC AGG AOC 1774 
Pro Glu Asn Gly Ser Glu Pro Glu Ser Arg Ser Pro Asp Asn Arg Thr 
580 585 590

CAG CTG GOC OOC GOG TGC ATG OCA GGG GGA OGC TOG TGC OCT GGA GOC 1822 
Gln Leu Ala Pro Ala Cys Met Pro Gly Gly Arg Trp Cys Pro Gly Ala
595 600 605

AAC ATC TGC TIG 00G CTG GAC GOC TCT TGC CAC COC CAG GOC TGC GOC 1870 
Asn Ile Cys Leu Pro Leu Asp Ala Ser Cys His Pro Gln Ala Cys Ala
610 615 620 

AAT GGC TGC AOG TCA GGG OCA GGG CTA OOC GGG GOC OOC TAT GOG CTA 1918 
Asn Gly Cys Thr Ser Gly Pro Gly Leu Pro Gly Ala Pro Tyr Ala Leu 
625 630 635 

TGG AGA GAG TTC CTC TTC TOC GTT GOC GOG GGG OOC OOC GOG CAG TAC 1966 
Trp Arg Glu Phe Leu Phe Ser Val Ala Ala Gly Pro Pro Ala Gln Tyr
640 645 650 655 

TOG GTC AOC CTC CAC GGC CAG GAT GTC CTC ATG CTC OCT GCT GAC CTC 2014 
Ser Val Thr Leu His Gly Gln Asp Val Leu Met Leu Pro Gly Asp Leu
660 665 670 

GTT GGC TTG CAG CAC GAC GCT GGC OCT GGC GOC CTC CTG CAC TGC TOG 2062 
Val Gly Leu Gln His Asp Ala Gly Pro Gly Ala Leu Leu His Cys Ser
675 680 685 

COG GCT OOC GGC CAC OCT GCT OOC CAG GOC COG TAC CTC TOC GOC AAC 2110 
Pro Ala Pro Gly His Pro Gly Pro Gln Ala Pro Tyr Leu Ser Ala Asn
690 695 700 

GOC TOG TCA TGG CTG OOC CAC TTG OCA GOC CAG CTG GAG GGC ACT TGG 2158 
Ala Ser Ser Trp Leu Pro His Leu Pro Ala Gln Leu Glu Gly Thr Trp
705 710 715 




GOC TGC OCT GOC TGT GOC CTG OGG CTG CTT GCA GOC AOG GAA CAG CTC 2206 
Ala Cys Pro Ala Cys Ala Leu Arg Leu Leu Ala Ala Thr Glu Gln Leu
720 725 730 735 

AOC CTG CTG CTG GOC TTG AGG 00C AAC OCT GGA CTG OGG ATG OCT GOG 2254 
Thr Val Leu Leu Gly Leu Arg Pro Asn Pro Gly Leu Arg Met Pro Gly 
740 745 750

CGC TAT GAG GTC COG OCA GAG GIG GGC AAT GGC GTG TOC AGG CAC AAC 2302 
Arg Tyr Glu Val Arg Ala Glu Val Gly Asn Gly Val Ser Arg His Asn 
755 760 765 

CTC TOC TOC AOC TTT GAC GTG GTC TOC OCA GTG GOT GGG CTG OGG CTC 2350 
Leu Ser Cys Ser Phe Asp Val Val Ser Pro Val Ala Gly Leu Arg Val
770 775 780 

ATC TAC OCT GOC COC CGC GAC GGC CGC CTC TAC CTG OOC AOC AAC GGC 2398 
Ile Tyr Pro Ala Pro Arg Asp Gly Arg Leu Tyr Val Pro Thr Asn Gly
785 790 795

TCA GOC TTG GIG CTC CAG GTG GAC TCT GCT GOC AAC GOC AOG GOC AOG 2446 
Ser Ala Leu Val Leu Gln Val Asp Ser Gly Ala Asn Ala Thr Ala Thr
800 805 810 815 

GCT CGC TOG OCT GGG GGC ACT GTC AGC GOC CGC TTT GAG AAT GTC TOC 2494 
Ala Arg Trp Pro Gly Gly Ser Val Ser Ala Arg Phe Glu Asn Val Cys
820 825 830 

OCT GOC CTG GTG GOC AOC TIC GTG OOC GGC TOC OOC TOG GAG AOC AAC 2542 
Pro Ala Leu Val Ala Thr Phe Val Pro Gly Cys Pro Trp Glu Thr Asn 
835 840 845 

GAT AOC CTG TIC TCA GTPG GTA GCA CTG COG TOG CTC ACT GAG GGG GAG 2590 
Asp Thr Leu Phe Ser Val Val Ala Leu Pro Trp Leu Ser Glu Gly Glu 
850 855 860

CAC GTG GTG GAC GTG CTG GTG GAA AAC AGC GOC AGC GGG GOC AAC CIC 2638
His Val Val Asp Val Val Val Glu Asn Ser Ala Ser Arg Ala Asn Leu 
865 870 875 

AGC CTG OGG GTG AOG GOG GAG GAG OOC ATC TCT GGC CTC CGC GOC AOG 2686 
Ser Leu Arg Val Thr Ala Glu Glu Pro Ile Cys Gly Leu Arg Ala Thr
880 885 890 895 

OOC AGC OOC GAG GOC OCT GTA CTG CAG GGA GTC CTA GTG AGG TAC AGC 2734 
Pro Ser Pro Glu Ala Arg Val Leu Gln Gly Val Leu Val Arg Tyr Ser
900 905 910 

OOC GTG GTG GAG GOC GGC TOG GAC ATG GTC TIC GGG TGG AOC ATC AAC 2782 
Pro Val Val Glu Ala Gly Ser Asp Met Val Phe Arg Trp Thr Ile Asn
915 920 925 

GAC AAG CAG TOC CTG AOC TTC CAG AAC GTG GTC TTC AAT GTC ATT TAT 2830 
Asp Lys Gln Ser Leu Thr Phe Gln Asn Val Val Phe Asn Val Ile Tyr
930 935 940 

CAG AGC GOG GOG GTC TIC AAG CTC TCA CTG AOG GOC TOC AAC CAC GTG 2878 
Gln Ser Ala Ala Val Phe Lys Leu Ser Leu Thr Ala Ser Asn His Val
945 950 955

AGC AAC GTC AOC GTG AAC TAC AAC GTA AOC GTG GAG GGG ATG AAC AGG 2926
Ser Asn Val Thr Val Asn Tyr Asn Val Thr Val Glu Arg Met Asn Arg 
960 965 970 975

ATG CAG GOT CTG CAG GTC TOC ACA GTG COG GOC GTG CTG TOC COC AAT 2974 
Met Gln Gly Leu Gln Val Ser Thr Val Pro Ala Val Leu Ser Pro Asn
980 985 990 

GOC ACA CTG GTA CTG AOG GGT GGT GTG CTG GTG GAC TCA GOT GTG GAG 3022 
Ala Thr Leu Val Leu Thr Gly Gly Val Leu Val Asp Ser Ala Val Glu 
995 1000 1005 

GIG Q0C TTC CTG TGG AAC TTT GGG GAT GGG GAG CAG GOC CTC CAC CAG 3070 
Val Ala Phe Leu Trp Asn Phe Gly Asp Gly Glu Gln Ala Leu His Gln
1010 1015 1020 

TTC CAG OCT COG TAC AAC GAG TOC TTC COG GIT OCA GAC COC TOG GTG 3118 
Phe Gln Pro Pro Tyr Asn Glu Ser Phe Pro Val Pro Asp Pro Ser Val
1025 1030 1035 

GOC CAG GTG CTG GIG GAG CAC AAT GTC ATG CAC AOC TAC GCT GOC OCA 3166 
Ala Gln Val Leu Val Glu His Asn Val Met His Thr Tyr Ala Ala Pro
1040 1045 1050 1055 

GGT GAG TAC CTC CTG AOC GTG CTG GCA TCT AAT GOC TTC GAG AAC CTG 3214 
Gly Glu Tyr Leu Leu Thr Val Leu Ala Ser Asn Ala Phe Glu Asn Leu 
1060 1065 1070 

AOG CAG CAG GTG OCT GTG AOC GTG OGC GOC TOC CTG OOC TOC GTG GOT 3262 
Thr Gln Gln Val Pro Val Ser Val Arg Ala Ser Leu Pro Ser Val Ala
1075 1080 1085 

GTG GGT GTG ACT GAC GGC GTC CTG GTG GOC GGC OGG OOC GTC AOC TTC 3310 
Val Gly Val Ser Asp Gly Val Leu Val Ala Gly Arg Pro Val Thr Phe 
1090 1095 1100 

TAC OCG CAC COG CTG OOC TOG OCT GGG GGT GTT CTT TAC AOG TGG GAC 3358 
Tyr Pro His Pro Leu Pro Ser Pro Gly Gly Val Leu Tyr Thr Trp Asp 
1105 1110 1115 

TTC GGG GAC GGC TOC OCT GTC CTG ACC CAG AGC CAG OOG GCT GOC AAC 3406 
Phe Gly Asp Gly Ser Pro Val Leu Thr Gln Ser Gln Pro Ala Ala Asn
1120 1125 1130 1135 

CAC AOC TAT GOC TOG AGG GGC AOC TAC CAC GTG OGC CTG GAG GTC AAC 3454 
His Thr Tyr Ala Ser Arg Gly Thr Tyr His Val Arg Leu Glu Val Asn 
1140 1145 1150

AAC AOG GTG AGC GGT GOG GOG GOC CAG GOG GAT GTG OGC GTC TTT GAG 3502 
Asn Thr Val Ser Gly Ala Ala Ala Gln Ala Asp Val Arg Val Phe Glu
1155 1160 1165 

GAG CTC OGC GGA CTC AGC GTG GAC ATG AGC CTG GOC GTG GAG CAG GGC 3550 
Glu Leu Arg Gly Leu Ser Val Asp Met Ser Leu Ala Val Glu Gln Gly
1170 1175 1180 

GOC COC GTG GTG GTC AGC GOC GOG GTG CAG AOG GGC GAC AAC ATC AOG 3598 
Ala Pro Val Val Val Ser Ala Ala Val Gln Thr Gly Asp Asn Ile Thr
1185 1190 1195 




TGG ACC TIC GAC ATG GGG GAC GGC AOC GTG CTG "ICG GGC OOG GAG GCA 3646 
Trp Thr Phe Asp Met Gly Asp Gly Thr Val Leu Ser Gly Pro Glu Ala 
1200 1205 1210 1215 

ACA GTG GAG CAT GTG TAC CTG CGG GCA CAG AAC TGC ACA GTG AOC GTG 3694 
Thr Val Glu His Val Tyr Leu Arg Ala Gln Asn Cys Thr Val Thr Val
1220 1225 1230 

GGT GCG GCC AGC CCC GCC GGC CAC CTG GGC CGG AGC CTG CAC GTG CTG 3742 
Gly Ala Ala Ser Pro Ala Gly His Leu Ala Arg Ser Leu His Val Leu 
1235 1240 1245

GTC TTC GTC CTG GAG GTG CTG CGC GTT GAA CCC GCC GCC TGC ATC CCC 3790 
Val Phe Val Leu Glu Val Leu Arg Val Glu Pro Ala Ala Cys Ile Pro
1250 1255 1260 

ACG CAG OCT GAC GCG CGG CTC ACG GCC TAC GTC ACC GGG AAC COG GCC 3838 
Thr Gln Pro Asp Ala Arg Leu Thr Ala Tyr Val Thr Gly Asn Pro Ala
1265 1270 1275



CAC TAC CTC TTC GAC TGG ACC TTC GGG GAT GGC TCC TCC AAC ACG ACC 3886
His Tyr Leu Phe Asp Trp Thr Phe Gly Asp Gly Ser Ser Asn Thr Thr
1280 1285 1290 1295

GTG CGG GGG TGC COG ACG GTG ACA CAC AAC TTC ACG CGG AGC GGC ACG 3934 
Val Arg Gly Cys Pro Thr Val Thr His Asn Phe Thr Arg Ser Gly Thr 
1300 1305 1310 

TTC CCC CTG GOG CTG GTG CTG TCC AGC CGC GTG AAC AGG GCG CAT TAC 3982 
Phe Pro Leu Ala Leu Val Leu Ser Ser Arg Val Asn Arg Ala His Tyr 
1315 1320 1325 

TTC AOC AGC ATC TGC GTG GAG CCA GAG GTG GGC AAC GTC ACC CTG CAG 4030 
Phe Thr Ser Ile Cys Val Glu Pro Glu Val Gly Asn Val Thr Leu Gln
1330 1335 1340 

CCA GAG AGG CAG TIT GTG CAG CTC GGG GAC GAG GCC TGG CTG GTG GCA 4078 
Pro Glu Arg Gln Phe Val Gln Leu Gly Asp Glu Ala Trp Leu Val Ala
1345 1350 1355 

TGT GCC TGG CCC COG TTC CCC TAC CGC TAC ACC TGG GAC TTT GGC AOC 4126 
Cys Ala Trp Pro Pro Phe Pro Tyr Arg Tyr Thr Trp Asp Phe Gly Thr 
1360 1365 1370 1375 

GAG GAA GCC GCC CCC ACC CGT GCC AGG GGC CCT GAG GTG ACG TTC ATC 4174 
Glu Glu Ala Ala Pro Thr Arg Ala Arg Gly Pro Glu Val Thr Phe Ile
1380 1385 1390 

TAC OGA GAC CCA GGC TCC TAT CTT GIG ACA GTC ACC GOG TCC AAC AAC 4222 
Tyr Arg Asp Pro Gly Ser Tyr Leu Val Thr Val Thr Ala Ser Asn Asn 
1395 1400 1405 

ATC TCT OCT GCC AAT GAC TCA GCC CTG GTG GAG GTG CAG GAG CCC GTG 4270 
Ile Ser Ala Ala Asn Asp Ser Ala Leu Val Glu Val Gln Glu Pro Val
1410 1415 1420 

CTG GTC ACC AGC ATC AAG GTC AAT GGC TCC CTT GGG CTG GAG CTG CAG 4318 
Leu Val Thr Ser Ile Lys Val Asn Gly Ser Leu Gly Leu Glu Leu Gln
1425 1430 1435

CAG COG TAG CTG TTC TCT GCT GTG GOC OCT GGG OGC OOC GOC AGC TAG 4366
Gln Pro Tyr Leu Phe Ser Ala Val Gly Arg Gly Arg Pro Ala Ser Tyr
1440 1445 1450 1455 

CTG TGG GAT CTG GGG GAC GGT GGG TGG CTC GAG GCT COG GAG GTC AOC 4414 
Leu Trp Asp Leu Gly Asp Gly Gly Trp Leu Glu Gly Pro Glu Val Thr 
1460 1465 1470 

CAC GCT TAG AAC AGC ACA GGT GAC TTC AOC GIT AGG GTG GOC GGC TGG 4462 
His Ala Tyr Asn Ser Thr Gly Asp Phe Thr Val Arg Val Ala Gly Trp 
1475 1480 1485 

AAT GAG GTG AGC OGC AGC GAG GOC TGG CTC AAT GTG AGG GTG AAG OGG 4510 
Asn Glu Val Ser Arg Ser Glu Ala Trp Leu Asn Val Thr Val Lys Arg
1490 1495 1500

OGC GTG OGG GGG CTC GTC GTC AAT GCA AGC OGC AOG GTG GTG OOC CTG 4558 
Arg Val Arg Gly Leu Val Val Asn Ala Ser Arg Thr Val Val Pro Leu 
1505 1510 1515 

AAT GGG AGC GTG AGC TTC AGC AOG TOG CTG GAG GOC GGC ACT GAT CTG 4606 
Asn Gly Ser Val Ser Phe Ser Thr Ser Leu Glu Ala Gly Ser Asp Val 
1520 1525 1530 1535 

OGC TAT TOC TGG GTG CTC TGT GAC OGC TGC AOG OCC ATC OCT GGG GGT 4654 
Arg Tyr Ser Trp Val Leu Cys Asp Arg Cys Thr Pro Ile Pro Gly Gly
1540 1545 1550 

OCT AOC ATC TCT TAG AOC TTC OGC TOC CTG GGC AOC TTC AAT ATC ATC 4702 
Pro Thr Ile Ser Tyr Thr Phe Arg Ser Val Gly Thr Phe Asn Ile Ile
1555 1560 1565 

GTC AOG GCT GAG AAC GAG GTG GGC TCC GOC CAG GAC AGC ATC TTC GTC 4750 
Val Thr Ala Glu Asn Glu Val Gly Ser Ala Gln Asp Ser Ile Phe Val
1570 1575 1580 

TAT GTC CTG CAG CTC ATA GAG GGG CTG CAG GTG GTG GGC GCT GGC OGC 4798 
Tyr Val Leu Gln Leu Ile Glu Gly Leu Gln Val Val Gly Gly Gly Arg
1585 1590 1595 

TAG TTC OOC AOC AAC CAC AOG CTA CAG CTG CAG GOC GTG GIT AGG GAT 4846 
Tyr Phe Pro Thr Asn His Thr Val Gln Leu Gln Ala Val Val Arg Asp
1600 1605 1610 1615 

GGC AOC AAC GTC TOC TAG AGC TGG ACT GCC TGG AGG GAC AGG GGC COG 4894 
Gly Thr Asn Val Ser Tyr Ser Trp Thr Ala Trp Arg Asp Arg Gly Pro 
1620 1625 1630 

GCC CTG GOC GGC AGC GGC AAA GGC TTC TOG CTC AOC GTG CTC GAG GOC 4942 
Ala Leu Ala Gly Ser Gly Lys Gly Phe Ser Leu Thr Val Leu Glu Ala 
1635 1640 1645 

GGC AOC TAG CAT GTG CAG CTG OGG GOC AOC AAC ATG CTG GGC AGC GOC 4990 
Gly Thr Tyr His Val Gln Leu Arg Ala Thr Asn Met Leu Gly Ser Ala
1650 1655 1660 

TGG GOC GAC TGC AOC ATG GAC TTC GTG GAG OCT GTG GGG TGG CTG ATG 5038 
Trp Ala Asp Cys Thr Met Asp Phe Val Glu Pro Val Gly Trp Leu Met 
1665 1670 1675

GIG AOC GOC ICC COG AAC OCA OCT GOC GTC AAC ACA AGC GTC AOC CTC 5086
Val Thr Ala Ser Pro Asn Pro Ala Ala Val Asn Thr Ser Val Thr Leu 
1680 1685 1690 1695 

AGT GCC GAG CIG GCT GCT GOC ACT GGT GTC CTA TAC ACT TGG TOC TIG 5134 
Ser Ala Glu Leu Ala Gly Gly Ser Gly Val Val Tyr Thr Trp Ser Leu 
1700 1705 1710 

GAG GAG GOG CIG AGC TGG GAG AOC TOC GAG CCA TTT AOC ACC CAT AGC 5182 
Glu Glu Gly Leu Ser Trp Glu Thr Ser Glu Pro Phe Thr Thr His Ser 
1715 1720 1725 

TTC OOC ACA C0C GGC CTC CAC TIG GTC AOC ATG AOG GCA GGG AAC COG 5230 
Phe Pro Thr Pro Gly Leu His Leu Val Thr Met Thr Ala Gly Asn Pro 
1730 1735 1740 

CIG GGC TCA GOC AAC GOC AOC GIG GAA GTG GAT GIG CAG GIG OCT GIG 5278 
Leu Gly Ser Ala Asn Ala Thr Val Glu Val Asp Val Gln Val Pro Val
1745 1750 1755 

ACT GGC CTC AGC ATC AGG GOC AGC GAG OOC GGA GGC AGC TIC GIG GOG 5326 
Ser Gly Leu Ser Ile Arg Ala Ser Glu Pro Gly Gly Ser Phe Val Ala
1760 1765 1770 1775 

GOC GGG TOC TCT GIG CCC TTT TOG GGG CAG CIG GOC AOG GGC AOC AAT 5374 
Ala Gly Ser Ser Val Pro Phe Trp Gly Gln Leu Ala Thr Gly Thr Asn
1780 1785 1790 

GIG AGC TGG TOC TOG GCT GTG CCC GGC GGC AGC AGC AAG OCT GGC OCT 5422 
Val Ser Trp Cys Trp Ala Val Pro Gly Gly Ser Ser Lys Arg Gly Pro 
1795 1800 1805 

CAT GTC AOC ATG GTC TIC COG GAT GCT GGC AOC TTC TOC ATC OGG CTC 5470 
His Val Thr Met Val Phe Pro Asp Ala Gly Thr Phe Ser Ile Arg Leu
1810 1815 1820 

AAT GOC TOC AAC GCA GTC AGC TGG GTC TCA GOC AOG TAC AAC CTC AOG 5518 
Asn Ala Ser Asn Ala Val Ser Trp Val Ser Ala Thr Tyr Asn Leu Thr 
1825 1830 1835 

GOG GAG GAG OOC ATC GIG GGC CIG GTG CIG TGG GOC AGC AGC AAG GTG 5566 
Ala Glu Glu Pro Ile Val Gly Leu Val Leu Trp Ala Ser Ser Lys Val
1840 1845 1850 1855 

GIG GOG OOC GGG CAG CIG GTC CAT TTT CAG ATC CIG CIG GCT GOC GGC 5614 
Val Ala Pro Gly Gln Leu Val His Phe Gln Ile Leu Leu Ala Ala Gly
1860 1865 1870 

TCA GCT GTC AOC TTC GGC CIG CAG GTC GGC GGG GCC AAC COC GAG GIG 5662 
Ser Ala Val Thr Phe Arg Leu Gln Val Gly Gly Ala Asn Pro Glu Val
1875 1880 1885 

CTC OOC GGG OOC OCT TTC TOC CAC AGC TTC CCC GGC GTC GGA GAC CAC 5710 
Leu Pro Gly Pro Arg Phe Ser His Ser Phe Pro Arg Val Gly Asp His
1890 1895 1900 

GTG GIG AGC GTG OGG GGC AAA AAC CAC GIG AGC TGG GOC CAG GOG CAG 5758 
Val Val Ser Val Arg Gly Lys Asn His Val Ser Trp Ala Gln Ala Gln
1905 1910 1915 




GTG OGC ATC GTG GTG CTG GAG GOC GTG AGT GGG CTG CAG ATG OOC AAC 5806 
Val Arg Ile Val Val Leu Glu Ala Val Ser Gly Leu Gln Met Pro Asn
1920 1925 1930 1935 

TGC TOC GAG OCT GGC ATC GOC ACG GGC ACT GAG AGG AAC TIC ACA GOC 5854 
Cys Cys Glu Pro Gly Ile Ala Thr Gly Thr Glu Arg Asn Phe Thr Ala
1940 1945 1950 

OGC GTG CAG OGC GGC TCT OGG GTC GOC TAG GCC TOG TAC TTC TOG CTG 5902 
Arg Val Gln Arg Gly Ser Arg Val Ala Tyr Ala Trp Tyr Phe Ser Leu
1955 1960 1965 

CAG AAG GTC CAG GGC GAC TOG CTG GTC ATC CTG TOG GGC OGC GAC GTC 5950 
Gln Lys Val Gln Gly Asp Ser Leu Val Ile Leu Ser Gly Arg Asp Val
1970 1975 1980 

AOC TAC AOG OOC GTG GCC GOG GGG CTG TTG GAG ATC CAG GTG OGC GOC 5998 
Thr Tyr Thr Pro Val Ala Ala Gly Leu Leu Glu Ile Gln Val Arg Ala
1985 1990 1995 

TTC AAC GOC CTG GGC AGT GAG AAC OGC AOG CTG GTG CTG GAG GIT CAG 6046 
Phe Asn Ala Leu Gly Ser Glu Asn Arg Thr Leu Val Leu Glu Val Gln
2000 2005 2010 2015 

GAC GOC GTC CAG TAT GIG GOC CTG CAG AGC GGC OOC TGC TTC AOC AAC 6094 
Asp Ala Val Gln Tyr Val Ala Leu Gln Ser Gly Pro Cys Phe Thr Asn
2020 2025 2030

OGC TOG GOG CAG TTT GAG GOC GOC AOC AGC OOC AGC OOC OGG OCT GTG 6142 
Arg Ser Ala Gln Phe Glu Ala Ala Thr Ser Pro Ser Pro Arg Arg Val
2035 2040 2045 

GOC TAC CAC TGG GAC TTT GGG GAT GGG TOG OCA GGG CAG GAC ACA GAT 6190 
Ala Tyr His Trp Asp Phe Gly Asp Gly Ser Pro Gly Gln Asp Thr Asp
2050 2055 2060 

GAG OOC AGG GOC GAG CAC TOC TAC CTG AGG OCT GGG GAC TAC OGC GTG 6238
Glu Pro Arg Ala Glu His Ser Tyr Leu Arg Pro Gly Asp Tyr Arg Val 
2065 2070 2075

CAG GTG AAC GOC TOC AAC CTG GTG AGC TTC TTC GTG GOG CAG GOC AOG 6286 
Gln Val Asn Ala Ser Asn Leu Val Ser Phe Phe Val Ala Gln Ala Thr
2080 2085 2090 2095 

GIG AOC GTC CAG GTG CTG GOC TGC OGG GAG COG GAG GTG GAC GIG GTC 6334 
Val Thr Val Gln Val Leu Ala Cys Arg Glu Pro Glu Val Asp Val Val
2100 2105 2110 

CTG COC CTG CAG GTG CTG ATG OGG OGA TCA CAG OGC AAC TAC TTG GAG 6382 
Leu Pro Leu Gln Val Leu Met Arg Arg Ser Gln Arg Asn Tyr Leu Glu
2115 2120 2125 

GOC CAC GTT GAC CTG OGC GAC TGC GTC AOC TAC CAG ACT GAG TAC OGC 6430 
Ala His Val Asp Leu Arg Asp Cys Val Thr Tyr Gln Thr Glu Tyr Arg
2130 2135 2140 

TGG GAG GTG TAT OGC AOC GOC AGC TGC CAG OGG OCG GGG OGC OCA GOG 6478
Trp Glu Val Tyr Arg Thr Ala Ser Cys Gln Arg Pro Gly Arg Pro Ala
2145 2150 2155

OCT GTG GOC CIG OOC GGC GIG GAC GIG AGC OGG OCT CGG CTG GIG CTG 6526
Arg Val Ala Leu Pro Gly Val Asp Val Ser Arg Pro Arg Leu Val Leu
2160 2165 2170 2175 

COG COG CTG GOG CTG OCT GTG GGG CAC TAC TGC TTT GIG TTT GTC GTG 6574 
Pro Arg Leu Ala Leu Pro Val Gly His Tyr Cys Phe Val Phe Val Val 
2180 2185 2190 

TCA TIT GGG GAC AOG OCA CTG ACA CAG AGC ATC CAG GOC AAT GTG AOG 6622 
Ser Phe Gly Asp Thr Pro Leu Thr Gln Ser Ile Gln Ala Asn Val Thr
2195 2200 2205 

GTG GOC OOC GAG OGC CTG GTG OOC ATC ATT GAG GCT GGC TCA TAC OGC 6670 
Val Ala Pro Glu Arg Leu Val Pro Ile Ile Glu Gly Gly Ser Tyr Arg
2210 2215 2220 

GIG TGG TCA GAC ACA OGG GAC CIG GIG CIG GAT GGG AGC GAG TOC TAC 6718 
Val Trp Ser Asp Thr Arg Asp Leu Val Leu Asp Gly Ser Glu Ser Tyr
2225 2230 2235 

GAC OOC AAC CIG GAG GAC GGC GAC CAG AOG GOG CTC ACT TTC CAC TGG 6766 
Asp Pro Asn Leu Glu Asp Gly Asp Gln Thr Pro Leu Ser Phe His Trp
2240 2245 2250 2255 

GOC TCT GIG GCT TOG ACA CAG AGG GAG GCT GGC GGG TCT GOG CTG AAC 6814 
Ala Cys Val Ala Ser Thr Gln Arg Glu Ala Gly Gly Cys Ala Leu Asn
2260 2265 2270 

TTT GGG COC OGC GGG AGC AGC AOG GTC AOC ATT OCA OGG GAG OGG CIG 6862 
Phe Gly Pro Arg Gly Ser Ser Thr Val Thr Ile Pro Arg Glu Arg Leu
2275 2280 2285 

GOG GCT GGC GIG GAG TAC AOC TTC AGC CIG ACC GIG TGG AAG GOC GGC 6910 
Ala Ala Gly Val Glu Tyr Thr Phe Ser Leu Thr Val Trp Lys Ala Gly 
2290 2295 2300 

COC AAG GAG GAG GOC ACC AAC CAG AOG GIG CIG ATC OGG ACT GGC OGG 6958 
Arg Lys Glu Glu Ala Thr Asn Gln Thr Val Leu Ile Arg Ser Gly Arg
2305 2310 2315 

GIG OCC ATT GIG TOC TIG GAG TGT GIG TOC TGC AAG GCA CAG GOC GIG 7006 
Val Pro Ile Val Ser Leu Glu Cys Val Ser Cys Lys Ala Gln Ala Val
2320 2325 2330 2335 

TAC GAA GTG AGC OGC AGC TCC TAC GIG TAC TIG GAG GGC CGC TGC CTC 7054 
Tyr Glu Val Ser Arg Ser Ser Tyr Val Tyr Leu Glu Gly Arg Cys Leu 
2340 2345 2350

AAT TGC AGC AGC GGC TOC AAG OGA GGG OGG TGG GCT GCA OCT AOG TTC 7102 
Asn Cys Ser Ser Gly Ser Lys Arg Gly Arg Trp Ala Ala Arg Thr Phe 
2355 2360 2365 

AGC AAC AAG AOG CIG GIG CIG GAT GAG ACC ACC ACA TOC AOG GGC AGT 7150 
Ser Asn Lys Thr Leu Val Leu Asp Glu Thr Thr Thr Ser Thr Gly Ser
2370 2375 2380 

GCA GGC ATG OGA CIG GIG CTG OGG OGG GGC GIG CIG OGG GAC GGC GAG 7198 
Ala Gly Met Arg Leu Val Leu Arg Arg Gly Val Leu Arg Asp Gly Glu 
2385 2390 2395

GGA TAC AOC TIC AOG CIC AGG GTG CTG GGC OQC TCT GGC GAG GAG GAG 7246
Gly Tyr Thr Phe Thr Leu Thr Val Leu Gly Arg Ser Gly Glu Glu Glu
2400 2405 2410 2415 

GGC TGC GOC TOC ATC OGC CTG TOC OOC AAC OQC CCG OOG CTG GGG GGC 7294 
Gly Cys Ala Ser Ile Arg Leu Ser Pro Asn Arg Pro Pro Leu Gly Gly
2420 2425 2430

TCT TGC CGC CTC TTC OCA CTG GGC GCT GTG CAC GOC CTC AOC ADC AAG 7342 
Ser Cys Arg Leu Phe Pro Leu Gly Ala Val His Ala Leu Thr Thr Lys 
2435 2440 2445 

GTG CAC TTC GAA TGC ADG GGC TOG CAT GAC GOG GAG GAT GCT GGC GOC 7390 
Val His Phe Glu Cys Thr Gly Trp His Asp Ala Glu Asp Ala Gly Ala 
2450 2455 2460 

CCG CTG GTG TAC GOC CTG CTG CTG OGG GGC TGT OGC CAG GGC CAC TGC 7438 
Pro Leu Val Tyr Ala Leu Leu Leu Arg Arg Cys Arg Gln Gly His Cys
2465 2470 2475

GAG GAG TIC TCT CTC TAC AAG GGC AGC CTC TOC AGC TAC GGA GOC GTG 7486 
Glu Glu Phe Cys Val Tyr Lys Gly Ser Leu Ser Ser Tyr Gly Ala Val 
2480 2485 2490 2495 

CTG COC COG GCT TTC AGG OCA CAC TTC GAG GTG GGC CTG GOC CTG GTG 7534 
Leu Pro Pro Gly Phe Arg Pro His Phe Glu Val Gly Leu Ala Val Val 
2500 2505 2510 

GTG CAG GAC CAG CTG GGA GOC GCT GTG CTC GOC CTC AAC AGG TCT TTC 7582 
Val Gln Asp Gln Leu Gly Ala Ala Val Val Ala Leu Asn Arg Ser Leu
2515 2520 2525 

GOC ATC AOC CTC OCA GAG OOC AAC GGC AGC GCA ADG GGG CTC ACA CTC 7630 
Ala Ile Thr Leu Pro Glu Pro Asn Gly Ser Ala Thr Gly Leu Thr Val
2530 2535 2540 

TQG CTG CAC GGG CTC AOC GCT ACT GTG CTC CCA GGG CTG CTG OGG CAG 7678 
Trp Leu His Gly Leu Thr Ala Ser Val Leu Pro Gly Leu Leu Arg Gln
2545 2550 2555

GCC GAT COC CAG CAC CTC ATC GAG TAC TOG TTG GOC CTG GTC AOC GTG 7726 
Ala Asp Pro Gln His Val Ile Glu Tyr Ser Leu Ala Leu Val Thr Val
2560 2565 2570 2575 

CTG AAC GAG TAC GAG OGG GOC CTG GAC GTG GOG GCA GAG COC AAG CAC 7774 
Leu Asn Glu Tyr Glu Arg Ala Leu Asp Val Ala Ala Glu Pro Lys His 
2580 2585 2590 

GAG OGG CAG CAC OGA GOC CAG ATA OGC AAG AAC ATC AOG GAG ACT CTG 7822 
Glu Arg Gln His Arg Ala Gln Ile Arg Lys Asn Ile Thr Glu Thr Leu
2595 2600 2605 

GTG TOC CTG AGG GTC CAC ACT GTC GAT GAC ATC CAG CAG ATC GCT GCT 7870 
Val Ser Leu Arg Val His Thr Val Asp Asp Ile Gln Gln Ile Ala Ala
2610 2615 2620 

GOG CTG GOC CAG TGC ATG GGG OOC AGC AGG GAG CTC CTA TGC OGC TOG 7918 
Ala Leu Ala Gln Cys Met Gly Pro Ser Arg Glu Leu Val Cys Arg Ser
2625 2630 2635

TGC CTG AAG CAG AOG CTG CAC AAG CTG GAG GCC ATG ATG CTC ATC CTG 7966
Cys Leu Lys Gln Thr Leu His Lys Leu Glu Ala Met Met Leu Ile Leu
2640 2645 2650 2655 

CAG OCA GAG ACC ACC GOG GGC AOC GTG AOG OOC ACC GCC ATC GGA GAC 8014 
Gln Ala Glu Thr Thr Ala Gly Thr Val Thr Pro Thr Ala Ile Gly Asp
2660 2665 2670 

AGC ATC CTC AAC ATC ACA GGA GAC CTC ATC CAC CTG GCC AGC TOG GAC 8062 
Ser Ile Leu Asn Ile Thr Gly Asp Leu Ile His Leu Ala Ser Ser Asp
2675 2680 2685 



GTG CGG GGA CCA CAG OOC TCA GAG CTG GGA GCC GAG TCA CCA TCT CGG 8110
Val Arg Ala Pro Gln Pro Ser Glu Leu Gly Ala Glu Ser Pro Ser Arg
2690 2695 2700

ATG GTG GOG T0C CAG GCC TAG AAC CTG ACC TCT GCC CTC ATG CGC ATC 8158 
Met Val Ala Ser Gln Ala Tyr Asn Leu Thr Ser Ala Leu Met Arg Ile
2705 2710 2715 

CTC ATG CGC T0C CGC GTG CTC AAC GAG GAG COC CTG AOG CTG GCG GGC 8206 
Leu Met Arg Ser Arg Val Leu Asn Glu Glu Pro Leu Thr Leu Ala Gly
2720 2725 2730 2735 

GAG GAG ATC GTG GCC CAG GGC AAG CGC TOG GAC COG CGG AGC CTG CTG 8254 
Glu Glu Ile Val Ala Gln Gly Lys Arg Ser Asp Pro Arg Ser Leu Leu
2740 2745 2750

TGC TAT GGC GGC GCC CCA GGG OCT GGC TGC CAC TTC TOC ATC CCC GAG 8302 
Cys Tyr Gly Gly Ala Pro Gly Pro Gly Cys His Phe Ser Ile Pro Glu
2755 2760 2765 

GCT TTC AGC GGG GCC CTG GGC AAC CTC ACT GAC GTG GTG CAG CTC ATC 8350 
Ala Phe Ser Gly Ala Leu Ala Asn Leu Ser Asp Val Val Gln Leu Ile
2770 2775 2780 

TTT CTG GTG GAC TOC AAT COC ITT CCC TTT GGC TAT ATC AGC AAC TAG 8398 
Phe Leu Val Asp Ser Asn Pro Phe Pro Phe Gly Tyr Ile Ser Asn Tyr
2785 2790 2795 

ACC GTC TOC ACC AAG GTG GCC TOG ATG GCA TTC CAG ACA CAG GCC GGC 8446 
Thr Val Ser Thr Lys Val Ala Ser Met Ala Phe Gln Thr Gln Ala Gly
2800 2805 2810 2815 

GCC CAG ATC CCC ATC GAG CGG CTG GCC TCA GAG CGC GCC ATC ACC GTG 8494 
Ala Gln Ile Pro Ile Glu Arg Leu Ala Ser Glu Arg Ala Ile Thr Val
2820 2825 2830 

AAG GTG CCC AAC AAC TOG GAC TGG GCT GCC CGG GGC CAC CGC AGC TCC 8542 
Lys Val Pro Asn Asn Ser Asp Trp Ala Ala Arg Gly His Arg Ser Ser 
2835 2840 2845

GCC AAC TCC GCC AAC TCC GTT GTG GTC CAG CCC CAG GCC TCC GTC GGT 8590 
Ala Asn Ser Ala Asn Ser Val Val Val Gln Pro Gln Ala Ser Val Gly
2850 2855 2860 

GCT GTG GTC ACC CTG GAC AGC AGC AAC OCT GCG GCC GGG CTG CAT CTG 8638
Ala Val Val Thr Leu Asp Ser Ser Asn Pro Ala Ala Gly Leu His Leu 
2865 2870 2875

CAG CTC AAC TAT AOG CTG CTG GAC GGC CAC TAC CTG TCT GAG GAA OCT 8686
Gln Leu Asn Tyr Thr Leu Leu Asp Gly His Tyr Leu Ser Glu Glu Pro
2880 2885 2890 2895 

GAG COC TAC CTG GCA GTC TAC CTA CAC TOG GAG COC OGG OOC AAT GAG 8734 
Glu Pro Tyr Leu Ala Val Tyr Leu His Ser Glu Pro Arg Pro Asn Glu 
2900 2905 2910 

CAC AAC TGC TOG GCT AGO AGG AGG ATC OGC OCA GAG TCA CTC CAG GGT 8782 
His Asn Cys Ser Ala Ser Arg Arg Ile Arg Pro Glu Ser Leu Gln Gly
2915 2920 2925

GOT GAC CAC OGG OOC TAC A0C TTC TIC ATT TOC 00G GGG AGC AGA GAC 8830 
Ala Asp His Arg Pro Tyr Thr Phe Phe Ile Ser Pro Gly Ser Arg Asp
2930 2935 2940 

OCA GOG GGG ACT TAC CAT CTG AAC CTC TOC AGC CAC TIC OGC TGG TOG 8878 
Pro Ala Gly Ser Tyr His Leu Asn Leu Ser Ser His Phe Arg Trp Ser 
2945 2950 2955

GOG CTG CAG CTG TOC GIG GGC CTG TAC AOG TOC CTG TGC CAG TAC TTC 8926 
Ala Leu Gln Val Ser Val Gly Leu Tyr Thr Ser Leu Cys Gln Tyr Phe
2960 2965 2970 2975 

AGC GAG GAG GAC ATG CTG TGG OGG ACA GAG GGG CIG CTG COC CTG GAG 8974 
Ser Glu Glu Asp Met Val Trp Arg Thr Glu Gly Leu Leu Pro Leu Glu 
2980 2985 2990 

GAG AOC TOG OOC OGC CAG OOC CTC TGC CTC AOC OGC CAC CTC ACC GOC 9022 
Glu Thr Ser Pro Arg Gln Ala Val Cys Leu Thr Arg His Leu Thr Ala
2995 3000 3005 

TTC GGC GOC AGC CTC TIC GTG OOC OCA AGC CAT CTC OGC TTT GTG TIT 9070 
Phe Gly Ala Ser Leu Phe Val Pro Pro Ser His Val Arg Phe Val Phe 
3010 3015 3020 

OCT GAG O0G ACA GOG GAT CTA AAC TAC ATC GTC ATG CTG ACA TCT GCT 9118 
Pro Glu Pro Thr Ala Asp Val Asn Tyr Ile Val Met Leu Thr Cys Ala
3025 3030 3035 

GTG TGC CTG CTG AOC TAC ATG GTC ATG GOC GOC ATC CTG CAC AAG CTG 9166 
Val Cys Leu Val Thr Tyr Met Val Met Ala Ala Ile Leu His Lys Leu
3040 3045 3050 3055 

GAC CAG TIG GAT GOC AGC OGG GGC OGC GOC ATC OCT TTC TCT GGG CAG 9214 
Asp Gln Leu Asp Ala Ser Arg Gly Arg Ala Ile Pro Phe Cys Gly Gln
3060 3065 3070 

OGG GGC OGC TTC AAG TAC GAG ATC CTC GTC AAG ACA GGC TGG GGC OGG 9262 
Arg Gly Arg Phe Lys Tyr Glu Ile Leu Val Lys Thr Gly Trp Gly Arg
3075 3080 3085 

GGC TCA GGT AOC AOG GOC CAC GTG GGC ATC ATG CTG TAT GGG GIC GAC 9310 
Gly Ser Gly Thr Thr Ala His Val Gly Ile Met Leu Tyr Gly Val Asp
3090 3095 3100 

AGC OGG AGC GGC CAC OGG CAC CTG GAC GGC GAC AGA GOC TIC CAC OGC 9358 
Ser Arg Ser Gly His Arg His Leu Asp Gly Asp Arg Ala Phe His Arg 
3105 3110 3115

AAC AGC CTG GAC ATC TTC OGG ATC GOC ACC COG CAC AGC CTG GGT AGC 9406
Asn Ser Leu Asp Ile Phe Arg Ile Ala Thr Pro His Ser Leu Gly Ser
3120 3125 3130 3135 

GTG TGG AAG ATC CGA GTG TGG CAC GAC AAC AAA GGG CTC AGC OCT GOC 9454 
Val Trp Lys Ile Arg Val Trp His Asp Asn Lys Gly Leu Ser Pro Ala
3140 3145 3150 

TGG TTC CTG CAG CAC GTC ATC GTC AGG GAC CTG CAG ACG OCA CGC AGC 9502 
Trp Phe Leu Gln His Val Ile Val Arg Asp Leu Gln Thr Ala Arg Ser
3155 3160 3165 

GOC TTC TTC CTG GTC AAT GAC TGG CTT TOG GTG GAG ACG GAG GOC AAC 9550 
Ala Phe Phe Leu Val Asn Asp Trp Leu Ser Val Glu Thr Glu Ala Asn 
3170 3175 3180 

GGG GGC CTG GTG GAG AAG GAG GTG CTG GOC GOG AGC GAC GCA GOC CTT 9598 
Gly Gly Leu Val Glu Lys Glu Val Leu Ala Ala Ser Asp Ala Ala Leu 
3185 3190 3195 

TTG CGC TTC COG CGC CTG CTG GTG OCT GAG CTG CAG CGT GGC TTC TTT 9646 
Leu Arg Phe Arg Arg Leu Leu Val Ala Glu Leu Gln Arg Gly Phe Phe
3200 3205 3210 3215 

GAC AAG CAC ATC TGG CTC TOO ATA TGG GAC CGG COG OCT CGT AGC CGT 9694 
Asp Lys His Ile Trp Leu Ser Ile Trp Asp Arg Pro Pro Arg Ser Arg
3220 3225 3230 

TTC ACT CGC ATC CAG AGG GOC ACC TGC TGC GOT CTC CTC ATC TGC CTC 9742 
Phe Thr Arg Ile Gln Arg Ala Thr Cys Cys Val Leu Leu Ile Cys Leu
3235 3240 3245 

TTC CTG GGC GOC AAC GOC GTG TGG TAC GGG GOT GTT GGC GAC TOT GOC 9790 
Phe Leu Gly Ala Asn Ala Val Trp Tyr Gly Ala Val Gly Asp Ser Ala 
3250 3255 3260 



TAC AGC ACG GGG CAT GTG TOO AGG CTG AGC COG CTG AGC GTC GAC ACA 9838
Tyr Ser Thr Gly His Val Ser Arg Leu Ser Pro Leu Ser Val Asp Thr
3265 3270 3275

GTC GOT GTT GGC CTG GTG TOC AGC GTG GTT GTC TAT COC GTC TAC CTG 9886 
Val Ala Val Gly Leu Val Ser Ser Val Val Val Tyr Pro Val Tyr Leu 
3280 3285 3290 3295 

GOC ATC CTT TTT CTC TTC CGG AUG TOC CGG AGC AAG GTG GOT GGG AGC 9934 
Ala Ile Leu Phe Leu Phe Arg Met Ser Arg Ser Lys Val Ala Gly Ser
3300 3305 3310

COG AGC COC ACA OCT GOC GGG CAG CAG GTG CTG GAC ATC GAC AGC TGC 9982 
Pro Ser Pro Thr Pro Ala Gly Gln Gln Val Leu Asp Ile Asp Ser Cys
3315 3320 3325 

CTG GAC TOG TOC GTG CTG GAC AGC TOC TTC CTC ACG TTC TCA GGC CTC 10030 
Leu Asp Ser Ser Val Leu Asp Ser Ser Phe Leu Thr Phe Ser Gly Leu
3330 3335 3340 

CAC GOT GAG GOC TTT GTT GGA CAG ATG AAG ACT GAC TTG TTT CTG GAT 10078 
His Ala Glu Ala Phe Val Gly Gln Met Lys Ser Asp Leu Phe Leu Asp
3345 3350 3355

GAT TCT AAG AGT CTG GTG TGC TGG OOC TOC GOC GAG GGA AOG CTC ACT 10126
Asp Ser Lys Ser Leu Val Cys Trp Pro Ser Gly Glu Gly Thr Leu Ser 
3360 3365 3370 3375 

TGG COG GAC CTG CTC ACT GAC COG TOC ATT CTG GGT AGC AAT CTG OGG 10174 
Trp Pro Asp Leu Leu Ser Asp Pro Ser Ile Val Gly Ser Asn Leu Arg
3380 3385 3390 

CAG CTG GCA OGG GGC CAG GOG GGC CAT GGG CTG GGC OCA GAG GAG GAC 10222 
Gln Leu Ala Arg Gly Gln Ala Gly His Gly Leu Gly Pro Glu Glu Asp
3395 3400 3405 

GGC TTC TOC CTG GOC AGC OOC TAG TOG OCT GOC AAA TOC TIC TCA GCA 10270 
Gly Phe Ser Leu Ala Ser Pro Tyr Ser Pro Ala Lys Ser Phe Ser Ala 
3410 3415 3420 

TCA GAT GAA GAC CTG ATC CAG CAG GTC CTT GOC GAG GGG CTC AGC AGC 10318 
Ser Asp Glu Asp Leu Ile Gln Gln Val Leu Ala Glu Gly Val Ser Ser
3425 3430 3435 

OCA GOC CCT AGC CAA GAC AOC CAC ATG GAA AOG GAC CTG CTC AGC AGC 10366 
Pro Ala Pro Thr Gln Asp Thr His Met Glu Thr Asp Leu Leu Ser Ser
3440 3445 3450 3455 

CTG TOC AGC ACT OCT GGG GAG AAG ACA GAG AOG CTG GOG CTG CAG AGG 10414 
Leu Ser Ser Thr Pro Gly Glu Lys Thr Glu Thr Leu Ala Leu Gln Arg
3460 3465 3470

CTG GGG GAG CTG GGG CCA OOC AGC OCA GGC CTG AAC TGG GAA CAG OOC 10462 
Leu Gly Glu Leu Gly Pro Pro Ser Pro Gly Leu Asn Trp Glu Gln Pro
3475 3480 3485 

CAG GCA GOG AGG CTG TOC AGG ACA GGA CTG CTG GAG GGT CTG OGG AAG 10510 
Gln Ala Ala Arg Leu Ser Arg Thr Gly Leu Val Glu Gly Leu Arg Lys
3490 3495 3500 

OGC CTG CTG COG GOC TGG TCT GOC TOC CTG GOC CAC GGG CTC AGC CTG 10558 
Arg Leu Leu Pro Ala Trp Cys Ala Ser Leu Ala His Gly Leu Ser Leu
3505 3510 3515 

CTC CTG GTG GCT GTG GCT GTG GCT GTC TCA GGG TGG CTG GGT GGG AGC 10606 
Leu Leu Val Ala Val Ala Val Ala Val Ser Gly Trp Val Gly Ala Ser 
3520 3525 3530 3535 

TTC OOC O0G GGC GTG ACT GIT GOG TGG CTC CTG TOC AGC AGC GOC AGC 10654 
Phe Pro Pro Gly Val Ser Val Ala Trp Leu Leu Ser Ser Ser Ala Ser 
3540 3545 3550 

TTC CTG GOC TCA TTC CTC GGC TGG GAG OCA CTG AAG GTC TTG CTG GAA 10702 
Phe Leu Ala Ser Phe Leu Gly Trp Glu Pro Leu Lys Val Leu Leu Glu 
3555 3560 3565 

GOC CTG TAC TTC TCA CTG GIG GOC AAG OGG CTG CAC O0G GAT GAA GAT 10750 
Ala Leu Tyr Phe Ser Leu Val Ala Lys Arg Leu His Pro Asp Glu Asp
3570 3575 3580 

GAC AOC CTG CTA GAG AGC O0G GCT GIG AOG OCT GTG AGC GCA OCT GIG 10798 
Asp Thr Leu Val Glu Ser Pro Ala Val Thr Pro Val Ser Ala Arg Val 
3585 3590 3595

OOC OOC CTA COG OCA OCC CAC GGC TIT GCA CTC TTC CTG GOC AAG GAA 10846
Pro Arg Val Arg Pro Pro His Gly Phe Ala Leu Phe Leu Ala Lys Glu 
3600 3605 3610 3615 

GAA GOC OGC AAG CTC AAG AGG CTA CAT GGC ATG CTG CGG AGO CTC CTG 10894 
Glu Ala Arg Lys Val Lys Arg Leu His Gly Met Leu Arg Ser Leu Leu 
3620 3625 3630 

CTG TAC ATG CTT TTT CTG CTG CTG ADC CTG CTG GOC AGC TAT QGG GAT 10942 
Val Tyr Met Leu Phe Leu Leu Val Thr Leu Leu Ala Ser Tyr Gly Asp 
3635 3640 3645 

GOC TCA TGC CAT GGG CAC GOC TAC OCT CTG CAA AGC GOC ATC AAG CAG 10990 
Ala Ser Cys His Gly His Ala Tyr Arg Leu Gln Ser Ala Ile Lys Gln
3650 3655 3660 

GAG CTG CAC AGC CGG GOC TTC CTG GOC ATC ACG CGG TCT GAG GAG CTC 11038 
Glu Leu His Ser Arg Ala Phe Leu Ala Ile Thr Arg Ser Glu Glu Leu
3665 3670 3675 

TOG CCA TGG ATG GOC CAC GTG CTG CTG CCC TAC GTC CAC GGG AAC CAG 11086 
Trp Pro Trp Met Ala His Val Leu Leu Pro Tyr Val His Gly Asn Gln
3680 3685 3690 3695 

TCC AGC CCA GAG CTG GGG CCC CCA CGG CTG CGG CAG GTG CGG CTG CAG 11134 
Ser Ser Pro Glu Leu Gly Pro Pro Arg Leu Arg Gln Val Arg Leu Gln
3700 3705 3710 

GAA GCA CTC TAC CCA GAC CCT CCC GGC CCC AGG GTC CAC ACG TGC TCG 11182
Glu Ala Leu Tyr Pro Asp Pro Pro Gly Pro Arg Val His Thr Cys Ser
3715 3720 3725 

GCC GCA GGA GGC TTC AGC AGC AGC GAT TAC GAC CTT GGC TGG GAG ACT 11230 
Ala Ala Gly Gly Phe Ser Thr Ser Asp Tyr Asp Val Gly Trp Glu Ser 
3730 3735 3740 

CCT CAC AAT GGC TCG GGG AGG TGG GCC TAT TCA GCG CCG GAT CTG CTG 11278
Pro His Asn Gly Ser Gly Thr Trp Ala Tyr Ser Ala Pro Asp Leu Leu 
3745 3750 3755 

GGG GCA TGG TCC TGG GGC TCC TGT GCC GTG TAT GAC AGC GGG GGC TAC 11326 
Gly Ala Trp Ser Trp Gly Ser Cys Ala Val Tyr Asp Ser Gly Gly Tyr
3760 3765 3770 3775 

GTG CAG GAG CTG GGC CTG AGC CTG GAG GAG AGC CGC GAC CGG CTG CGC 11374
Val Gln Glu Leu Gly Leu Ser Leu Glu Glu Ser Arg Asp Arg Leu Arg
3780 3785 3790

TTC CTG CAG CTG CAC AAC TGG CTG GAC AAC AGG AGC CGC GCT GTG TTC 11422
Phe Leu Gln Leu His Asn Trp Leu Asp Asn Arg Ser Arg Ala Val Phe
3795 3800 3805

CTG GAG CTC ACG CGG TAC AGC CCG GCC GTG GGG CTG CAC GCC GCC GTC 11470
Leu Glu Leu Thr Arg Tyr Ser Pro Ala Val Gly Leu His Ala Ala Val 
3810 3815 3820 

ACG CTG CGC CTC GAG TTC CCG GCG GCC GGC CGC GCC CTG GCC GCC CTC 11518
Thr Leu Arg Leu Glu Phe Pro Ala Ala Gly Arg Ala Leu Ala Ala Leu
3825 3830 3835
AGC GTC CGC CCC TTT GCG CTG CGC CGC CTC AGC GCG GGC CTC TCG CTG 11566
Ser Val Arg Pro Phe Ala Leu Arg Arg Leu Ser Ala Gly Leu Ser Leu
3840 3845 3850 3855

CCT CTG CTC AGO TCG GTG TGC CTG CTG CTG TTC GCC GTG CAC TTC GCC 11614
Pro Leu Leu Thr Ser Val Cys Leu Leu Leu Phe Ala Val His Phe Ala
3860 3865 3870

GTG GCC GAG GCC CGT ACT TGG CAC AGG GAA GGG CGC TGG CGC GTG CTG 11662
Val Ala Glu Ala Arg Thr Trp His Arg Glu Gly Arg Trp Arg Val Leu
3875 3880 3885 

CGG CTC GGA GCC TGG GCG CGG TGG CTG CTG GTG GCG CTG AGG GCG GCC 11710
Arg Leu Gly Ala Trp Ala Arg Trp Leu Leu Val Ala Leu Thr Ala Ala
3890 3895 3900

AGG GCA CTG GTA CGC CTC GCC CAG CTG GGT GCC GCT GAC CGC CAG TGG 11758
Thr Ala Leu Val Arg Leu Ala Gln Leu Gly Ala Ala Asp Arg Gln Trp
3905 3910 3915 

ACC CGT TTC GTG CGC GGC CGC CCG CGC CGC TTC ACT AGC TTC GAC CAG 11806
Thr Arg Phe Val Arg Gly Arg Pro Arg Arg Phe Thr Ser Phe Asp Gln
3920 3925 3930 3935 

GTG GCG CAC GTG AGC TCC GCA GCC CGT GGC CTG GCG GCC TCG CTG CTC 11854
Val Ala His Val Ser Ser Ala Ala Arg Gly Leu Ala Ala Ser Leu Leu 
3940 3945 3950 

TTC CTG CTT TTG GTC AAG GCT GCC CAG CAC GTA CGC TTC GTG CGC CAG 11902
Phe Leu Leu Leu Val Lys Ala Ala Gln His Val Arg Phe Val Arg Gln
3955 3960 3965 

TGG TCC GTC TTT GGC AAG ACA TTA TGC CGA GCT CTG CCA GAG CTC CTG 11950
Trp Ser Val Phe Gly Lys Thr Leu Cys Arg Ala Leu Pro Glu Leu Leu 
3970 3975 3980 

GGG GTC ACC TTG GGC CTG GTG GTG CTC GGG GTA GCC TAG GCC CAG CTG 11998
Gly Val Thr Leu Gly Leu Val Val Leu Gly Val Ala Tyr Ala Gln Leu
3985 3990 3995

GCC ATC CTG CTC GTG TCT TCC TGT GTG GAC TCC CTC TGG AGC GTG GCC 12046
Ala Ile Leu Leu Val Ser Ser Cys Val Asp Ser Leu Trp Ser Val Ala
4000 4005 4010 4015 

CAG GCC CTG TTG GTG CTG TGC CCT GGG ACT GGG CTC TCT ACC CTG TGT 12094
Gln Ala Leu Leu Val Leu Cys Pro Gly Thr Gly Leu Ser Thr Leu Cys
4020 4025 4030 

CCT GCC GAG TCC TGG CAC CTG TCA CCC CTG CTG TGT GTG GGG CTC TGG 12142
Pro Ala Glu Ser Trp His Leu Ser Pro Leu Leu Cys Val Gly Leu Trp 
4035 4040 4045 

GCA CTG CGG CTG TGG GGC GCC CTA CGG CTG GGG GCT GTT ATT CTC CGC 12190
Ala Leu Arg Leu Trp Gly Ala Leu Arg Leu Gly Ala Val Ile Leu Arg
4050 4055 4060

TGG CGC TAG CAC GCC TTG CGT GGA GAG CTG TAC CGG CCG GCC TGG GAG 12238
Trp Arg Tyr His Ala Leu Arg Gly Glu Leu Tyr Arg Pro Ala Trp Glu 
4065 4070 4075 
CCC CAG GAC TAC GAG ATG GTG GAG TTG TTC CTG CGC AGG CTG CGC CTC 12286
Pro Gln Asp Tyr Glu Met Val Glu Leu Phe Leu Arg Arg Leu Arg Leu
4080 4085 4090 4095 

TGG ATG GGC CTC AGC AAG GTC AAG GAG TTC CGC CAC AAA GTC CGC TTT 12334
Trp Met Gly Leu Ser Lys Val Lys Glu Phe Arg His Lys Val Arg Phe 
4100 4105 4110 

GAA GGG ATG GAG CCG CTG CCC TCT CGC TCC TCC AGG GGC TCC AAG GTA 12382
Glu Gly Met Glu Pro Leu Pro Ser Arg Ser Ser Arg Gly Ser Lys Val
4115 4120 4125

TCC CCG GAT GTG CCC CCA CCC AGC GCT GGC TCC GAT GCC TCG CAC CCC 12430
Ser Pro Asp Val Pro Pro Pro Ser Ala Gly Ser Asp Ala Ser His Pro 
4130 4135 4140 

TOO AGC TCC TOO AGC CAG CTG GAT GGG CTG AGC GTG AGC CTG GGC GGG 12478 
Ser Thr Ser Ser Ser Gln Leu Asp Gly Leu Ser Val Ser Leu Gly Arg
4145 4150 4155

CTG GGG ACA AGG TGT GAG OCT GAG COO TOO CGC CTC CAA GOO GTG TTC 12526 
Leu Gly Thr Arg Cys Glu Pro Glu Pro Ser Arg Leu Gln Ala Val Phe
4160 4165 4170 4175 

GAG GCC CTG CTC ACC CAG TTT GAC CGA CTC AAC CAG GCC ACA GAG GAC 12574 
Glu Ala Leu Leu Thr Gln Phe Asp Arg Leu Asn Gln Ala Thr Glu Asp
4180 4185 4190 

GTC TAC CAG CTG GAG CAG CAG CTG CAC AGC CTG CAA GGC CGC AGG AGC 12622 
Val Tyr Gln Leu Glu Gln Gln Leu His Ser Leu Gln Gly Arg Arg Ser
4195 4200 4205 

AGC COG GOG CCC GCC GGA TCT TCC CGT GGC CCA TOO COG GGC CTG COG 12670 
Ser Arg Ala Pro Ala Gly Ser Ser Arg Gly Pro Ser Pro Gly Leu Arg
4210 4215 4220

CCA GCA CTG CCC AGC CGC CTT GCC GGG GCC ACT CGG GGT GTG GAC CTG 12718 
Pro Ala Leu Pro Ser Arg Leu Ala Arg Ala Ser Arg Gly Val Asp Leu
4225 4230 4235 

GCC ACT GGC CCC AGC AGG ACA CCT TCG GGC CAA GAA CAA GGT CCA CCC 12766
Ala Thr Gly Pro Ser Arg Thr Pro Ser Gly Gln Glu Gln Gly Pro Pro
4240 4245 4250 4255 

CAG CAG CAC TTA GTC CTC CTT CCT GGC GGG GGT GGG CCG TGG ACT CGG 12814
Gln Gln His Leu Val Leu Leu Pro Gly Gly Gly Gly Pro Trp Ser Arg
4260 4265 4270 

ACT GGA CAC CGC TCA GTA TTA CTT TCT GCC GCT GTC AAG GCC GAG GGC 12862
Ser Gly His Arg Ser Val Leu Leu Ser Ala Ala Val Lys Ala Glu Gly 
4275 4280 4285 

CAG GCA GAA TGG CTG CAC GTA GGT TCC CCA GAG AGC AGG CAG GGG CAT 12910 
Gln Ala Glu Trp Leu His Val Gly Ser Pro Glu Ser Arg Gln Gly His
4290 4295 4300 

CTG TCT GTC TCT GGG CTT CAG CAC TTT AAA GAG GCT GTG TGG CCA ACC 12958
Leu Ser Val Cys Gly Leu Gln His Phe Lys Glu Ala Val Trp Pro Thr
4305 4310 4315 
AGG ACC CAG GGT CCC CTC CCC AGC TCC CTT GGG AAG GAC ACA GCA GTA 13006
Arg Thr Gln Gly Pro Leu Pro Ser Ser Leu Gly Lys Asp Thr Ala Val
4320 4325 4330 4335

TTG GAC GGT TTC TAGOCTCTGA GATGCTAATT TATTT0C0OG AGTOCTCAGG 13058
Leu Asp Gly Phe

TACAGOGGGC TCTQOOCGGC CXXAOOOOCT GQGCAGATGT OOOCCACTGC TAAGGCTGCT 13118
GGCTTCAGGG AGGGTETAGOC TGCAO0G00G OCAOOCTGOC CCTAACTTAT TACCTCTOCA 13178
GTTOCTADOG TACTOOCTGC AOOGTCTCAC TGTGTGTCTC CTGrTCAGTAA TTTATATGGT 13238
GTTAAAATGT GTATATTTTT CTATGTCACT ATTTTCACTA GGGCTGAGGG GOCTGOGOOC 13298
AGAGCTGGOC T0000CAACA GCTGCTGOGC TTGGTAGCTG TGGTGGCGIT ATGGCAGOOC 13358
GGCTGCTGCT TGGATGOGAG CTTGGOCTTG GGOOGGTGCT GGGGGCACAG 13418
GGCACICTCA TCAOCOCAGA GGOCTTGTCA TOCTCCCTTG COCXy^GGOCA GGTAGCAAGA 13478
GAGCAGOGOC CAGGOCTGCT GGGATCAGCT CTGQGCAAGT AGCAGGACTA GGCATGTCAG 13538
AGGAOOOCAG GCTGGITAGA GGAAAAGACT CCTOCTGGGG GCTQGCTOCC AGQGTGGAGG 13598
AAGGTGACTG TCTCTCTCTG TCTGTGOGOG CX30GAOGCX3C GAGTCTGCTG TATGGCCCAG 13658
GCAGOCTCAA GGOCTTOGGA GCTGGCTGTG OCTGCTTCTG TGTACCACTr CTGTGGGCAT 13718
GQOOGCTTCT AGAGQCTCGA CACOOOCCCA AOOOCOGCAC CAAGCAGACA AAGTTCAATAA 13778
AAGAGCTGTC TGACTGCAAA AAAAAAAAA 13807

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Gly Ala Ala Cys Arg Val Asn Cys Ser Gly Arg Gly Leu Arg Thr Leu
1 5 10 15

Gly Pro Ala Leu Arg Ile Pro Ala Asp Ala Thr Ala Leu Asp Val Ser
20 25 30

His Asn Leu Leu Arg Ala Leu Asp Val Gly Leu Leu Ala Asn Leu Ser
35 40 45

Ala Leu Ala Glu Leu Asp Ile Ser Asn Asn Lys Ile Ser Thr Leu Glu
50 55 60

Glu Gly Ile Phe Ala Asn Leu Phe Asn Leu Ser Glu Ile Asn Leu Ser
65 70 75 80

Gly Asn Pro Phe Glu Cys Asp Cys Gly Leu Ala Trp Leu Pro Arg Trp
85 90 95

Ala Glu Glu Gln Gln Val Arg Val Val Gln Pro Glu Ala Ala Thr Cys
100 105 110

Ala Gly Pro Gly Ser Leu Ala Gly Gln Pro Leu Leu Gly Ile Pro Leu
115 120 125

Leu Asp Ser Gly Cys Gly Glu Glu Tyr Val Ala Cys Leu Pro Asp Asn
130 135 140 

Ser Ser Gly Thr Val Ala Ala Val Ser Phe Ser Ala Ala His Glu Gly 
145 150 155 160 

Leu Leu Gln Pro Glu Ala Cys Ser Ala Phe Cys Phe Ser Thr Gly Gln
165 170 175 

Gly Leu Ala Ala Leu Ser Glu Gln Gly Trp Cys Leu Cys Gly Ala Ala
180 185 190 

Gln Pro Ser Ser Ala Ser Phe Ala Cys Leu Ser Leu Cys Ser Gly Pro
195 200 205 

Pro Pro Pro Pro Ala Pro Thr Cys Arg Gly Pro Thr Leu Leu Gln His
210 215 220 

Val Phe Pro Ala Ser Pro Gly Ala Thr Leu Val Gly Pro His Gly Pro
225 230 235 240 

Leu Ala Ser Gly Gln Leu Ala Ala Phe His Ile Ala Ala Pro Leu Pro
245 250 255 

Val Thr Ala Thr Arg Trp Asp Phe Gly Asp Gly Ser Ala Glu Val Asp
260 265 270

Ala Ala Gly Pro Ala Ala Ser His Arg Tyr Val Leu Pro Gly Arg Tyr
275 280 285 

His Val Thr Ala Val Leu Ala Leu Gly Ala Gly Ser Ala Leu Leu Gly
290 295 300 

Thr Asp Val Gln Val Glu Ala Ala Pro Ala Ala Leu Glu Leu Val Cys
305 310 315 320 

Pro Ser Ser Val Gln Ser Asp Glu Ser Leu Asp Leu Ser Ile Gln Asn
325 330 335 

Arg Gly Gly Ser Gly Leu Glu Ala Ala Tyr Ser Ile Val Ala Leu Gly
340 345 350 

Glu Glu Pro Ala Arg Ala Val His Pro Leu Cys Pro Ser Asp Thr Glu 
355 360 365 

Ile Phe Pro Gly Asn Gly His Cys Tyr Arg Leu Val Val Glu Lys Ala
370 375 380

Ala Trp Leu Gln Ala Gln Glu Gln Cys Gln Ala Trp Ala Gly Ala Ala
385 390 395 400 

Leu Ala Met Val Asp Ser Pro Ala Val Gln Arg Phe Leu Val Ser Arg
405 410 415

Val Thr Arg Ser Leu Asp Val Trp Ile Gly Phe Ser Thr Val Gln Gly
420 425 430

Val Glu Val Gly Pro Ala Pro Gln Gly Glu Ala Phe Ser Leu Glu Ser
435 440 445 

Cys Gln Asn Trp Leu Pro Gly Glu Pro His Pro Ala Thr Ala Glu His
450 455 460
Cys Val Arg Leu Gly Pro Thr Gly Trp Cys Asn Thr Asp Leu Cys Ser
465 470 475 480 

Ala Pro His Ser Tyr Val Cys Glu Leu Gln Pro Gly Gly Pro Val Gln
485 490 495 

Asp Ala Glu Asn Leu Leu Val Gly Ala Pro Ser Gly Asp Leu Gln Gly
500 505 510 

Pro Leu Thr Pro Leu Ala Gln Gln Asp Gly Leu Ser Ala Pro His Glu
515 520 525 

Pro Val Glu Val Met Val Phe Pro Gly Leu Arg Leu Ser Arg Glu Ala 
530 535 540 

Phe Leu Thr Thr Ala Glu Phe Gly Thr Gln Glu Leu Arg Arg Pro Ala
545 550 555 560

Gln Leu Arg Leu Gln Val Tyr Arg Leu Leu Ser Thr Ala Gly Thr Pro
565 570 575 

Glu Asn Gly Ser Glu Pro Glu Ser Arg Ser Pro Asp Asn Arg Thr Gln
580 585 590

Leu Ala Pro Ala Cys Met Pro Gly Gly Arg Trp Cys Pro Gly Ala Asn 
595 600 605 

Ile Cys Leu Pro Leu Asp Ala Ser Cys His Pro Gln Ala Cys Ala Asn
610 615 620 

Gly Cys Thr Ser Gly Pro Gly Leu Pro Gly Ala Pro Tyr Ala Leu Trp 
625 630 635 640 

Arg Glu Phe Leu Phe Ser Val Ala Ala Gly Pro Pro Ala Gln Tyr Ser
645 650 655 

Val Thr Leu His Gly Gln Asp Val Leu Met Leu Pro Gly Asp Leu Val
660 665 670 

Gly Leu Gln His Asp Ala Gly Pro Gly Ala Leu Leu His Cys Ser Pro
675 680 685 

Ala Pro Gly His Pro Gly Pro Gln Ala Pro Tyr Leu Ser Ala Asn Ala
690 695 700

Ser Ser Trp Leu Pro His Leu Pro Ala Gln Leu Glu Gly Thr Trp Ala
705 710 715 720

Cys Pro Ala Cys Ala Leu Arg Leu Leu Ala Ala Thr Glu Gln Leu Thr
725 730 735 

Val Leu Leu Gly Leu Arg Pro Asn Pro Gly Leu Arg Met Pro Gly Arg 
740 745 750 

Tyr Glu Val Arg Ala Glu Val Gly Asn Gly Val Ser Arg His Asn Leu 
755 760 765 

Ser Cys Ser Phe Asp Val Val Ser Pro Val Ala Gly Leu Arg Val Ile
770 775 780 

Tyr Pro Ala Pro Arg Asp Gly Arg Leu Tyr Val Pro Thr Asn Gly Ser 
785 790 795 800 
Ala Leu Val Leu Gln Val Asp Ser Gly Ala Asn Ala Thr Ala Thr Ala
805 810 815 

Arg Trp Pro Gly Gly Ser Val Ser Ala Arg Phe Glu Asn Val Cys Pro 
820 825 830 

Ala Leu Val Ala Thr Phe Val Pro Gly Cys Pro Trp Glu Thr Asn Asp 
835 840 845 

Thr Leu Phe Ser Val Val Ala Leu Pro Trp Leu Ser Glu Gly Glu His
850 855 860 

Val Val Asp Val Val Val Glu Asn Ser Ala Ser Arg Ala Asn Leu Ser 
865 870 875 880 

Leu Arg Val Thr Ala Glu Glu Pro Ile Cys Gly Leu Arg Ala Thr Pro
885 890 895

Ser Pro Glu Ala Arg Val Leu Gln Gly Val Leu Val Arg Tyr Ser Pro
900 905 910 

Val Val Glu Ala Gly Ser Asp Met Val Phe Arg Trp Thr Ile Asn Asp
915 920 925 

Lys Gln Ser Leu Thr Phe Gln Asn Val Val Phe Asn Val Ile Tyr Gln
930 935 940 

Ser Ala Ala Val Phe Lys Leu Ser Leu Thr Ala Ser Asn His Val Ser 
945 950 955 960 

Asn Val Thr Val Asn Tyr Asn Val Thr Val Glu Arg Met Asn Arg Met 
965 970 975 

Gln Gly Leu Gln Val Ser Thr Val Pro Ala Val Leu Ser Pro Asn Ala
980 985 990 

Thr Leu Val Leu Thr Gly Gly Val Leu Val Asp Ser Ala Val Glu Val
995 1000 1005 

Ala Phe Leu Trp Asn Phe Gly Asp Gly Glu Gln Ala Leu His Gln Phe
1010 1015 1020 

Gln Pro Pro Tyr Asn Glu Ser Phe Pro Val Pro Asp Pro Ser Val Ala
1025 1030 1035 1040 

Gln Val Leu Val Glu His Asn Val Met His Thr Tyr Ala Ala Pro Gly
1045 1050 1055 

Glu Tyr Leu Leu Thr Val Leu Ala Ser Asn Ala Phe Glu Asn Leu Thr 
1060 1065 1070 

Gln Gln Val Pro Val Ser Val Arg Ala Ser Leu Pro Ser Val Ala Val
1075 1080 1085 

Gly Val Ser Asp Gly Val Leu Val Ala Gly Arg Pro Val Thr Phe Tyr 
1090 1095 1100 

Pro His Pro Leu Pro Ser Pro Gly Gly Val Leu Tyr Thr Trp Asp Phe
1105 1110 1115 1120 

Gly Asp Gly Ser Pro Val Leu Thr Gln Ser Gln Pro Ala Ala Asn His
1125 1130 1135
Thr Tyr Ala Ser Arg Gly Thr Tyr His Val Arg Leu Glu Val Asn Asn
1140 1145 1150

Thr Val Ser Gly Ala Ala Ala Gln Ala Asp Val Arg Val Phe Glu Glu
1155 1160 1165 

Leu Arg Gly Leu Ser Val Asp Met Ser Leu Ala Val Glu Gln Gly Ala
1170 1175 1180 

Pro Val Val Val Ser Ala Ala Val Gln Thr Gly Asp Asn Ile Thr Trp
1185 1190 1195 1200 

Thr Phe Asp Met Gly Asp Gly Thr Val Leu Ser Gly Pro Glu Ala Thr
1205 1210 1215

Val Glu His Val Tyr Leu Arg Ala Gln Asn Cys Thr Val Thr Val Gly
1220 1225 1230 

Ala Ala Ser Pro Ala Gly His Leu Ala Arg Ser Leu His Val Leu Val 
1235 1240 1245

Phe Val Leu Glu Val Leu Arg Val Glu Pro Ala Ala Cys Ile Pro Thr
1250 1255 1260 

Gln Pro Asp Ala Arg Leu Thr Ala Tyr Val Thr Gly Asn Pro Ala His
1265 1270 1275 1280 

Tyr Leu Phe Asp Trp Thr Phe Gly Asp Gly Ser Ser Asn Thr Thr Val 
1285 1290 1295 

Arg Gly Cys Pro Thr Val Thr His Asn Phe Thr Arg Ser Gly Thr Phe 
1300 1305 1310

Pro Leu Ala Leu Val Leu Ser Ser Arg Val Asn Arg Ala His Tyr Phe 
1315 1320 1325 

Thr Ser Ile Cys Val Glu Pro Glu Val Gly Asn Val Thr Leu Gln Pro
1330 1335 1340 

Glu Arg Gln Phe Val Gln Leu Gly Asp Glu Ala Trp Leu Val Ala Cys
1345 1350 1355 1360 

Ala Trp Pro Pro Phe Pro Tyr Arg Tyr Thr Trp Asp Phe Gly Thr Glu 
1365 1370 1375

Glu Ala Ala Pro Thr Arg Ala Arg Gly Pro Glu Val Thr Phe Ile Tyr
1380 1385 1390 

Arg Asp Pro Gly Ser Tyr Leu Val Thr Val Thr Ala Ser Asn Asn Ile
1395 1400 1405

Ser Ala Ala Asn Asp Ser Ala Leu Val Glu Val Gln Glu Pro Val Leu
1410 1415 1420 

Val Thr Ser Ile Lys Val Asn Gly Ser Leu Gly Leu Glu Leu Gln Gln
1425 1430 1435 1440 

Pro Tyr Leu Phe Ser Ala Val Gly Arg Gly Arg Pro Ala Ser Tyr Leu 
1445 1450 1455 
Trp Asp Leu Gly Asp Gly Gly Trp Leu Glu Gly Pro Glu Val Thr His
1460 1465 1470 

Ala Tyr Asn Ser Thr Gly Asp Phe Thr Val Arg Val Ala Gly Trp Asn
1475 1480 1485

Glu Val Ser Arg Ser Glu Ala Trp Leu Asn Val Thr Val Lys Arg Arg
1490 1495 1500 

Val Arg Gly Leu Val Val Asn Ala Ser Arg Thr Val Val Pro Leu Asn 
1505 1510 1515 1520

Gly Ser Val Ser Phe Ser Thr Ser Leu Glu Ala Gly Ser Asp Val Arg
1525 1530 1535

Tyr Ser Trp Val Leu Cys Asp Arg Cys Thr Pro Ile Pro Gly Gly Pro
1540 1545 1550 

Thr Ile Ser Tyr Thr Phe Arg Ser Val Gly Thr Phe Asn Ile Ile Val
1555 1560 1565 

Thr Ala Glu Asn Glu Val Gly Ser Ala Gln Asp Ser Ile Phe Val Tyr
1570 1575 1580 

Val Leu Gln Leu Ile Glu Gly Leu Gln Val Val Gly Gly Gly Arg Tyr
1585 1590 1595 1600 

Phe Pro Thr Asn His Thr Val Gln Leu Gln Ala Val Val Arg Asp Gly
1605 1610 1615

Thr Asn Val Ser Tyr Ser Trp Thr Ala Trp Arg Asp Arg Gly Pro Ala 
1620 1625 1630

Leu Ala Gly Ser Gly Lys Gly Phe Ser Leu Thr Val Leu Glu Ala Gly
1635 1640 1645 

Thr Tyr His Val Gln Leu Arg Ala Thr Asn Met Leu Gly Ser Ala Trp
1650 1655 1660 

Ala Asp Cys Thr Met Asp Phe Val Glu Pro Val Gly Trp Leu Met Val 
1665 1670 1675 1680 

Thr Ala Ser Pro Asn Pro Ala Ala Val Asn Thr Ser Val Thr Leu Ser 
1685 1690 1695 

Ala Glu Leu Ala Gly Gly Ser Gly Val Val Tyr Thr Trp Ser Leu Glu 
1700 1705 1710 

Glu Gly Leu Ser Trp Glu Thr Ser Glu Pro Phe Thr Thr His Ser Phe 
1715 1720 1725 

Pro Thr Pro Gly Leu His Leu Val Thr Met Thr Ala Gly Asn Pro Leu
1730 1735 1740 

Gly Ser Ala Asn Ala Thr Val Glu Val Asp Val Gln Val Pro Val Ser
1745 1750 1755 1760

Gly Leu Ser Ile Arg Ala Ser Glu Pro Gly Gly Ser Phe Val Ala Ala
1765 1770 1775 
Gly Ser Ser Val Pro Phe Trp Gly Gln Leu Ala Thr Gly Thr Asn Val
1780 1785 1790 

Ser Trp Cys Trp Ala Val Pro Gly Gly Ser Ser Lys Arg Gly Pro His 
1795 1800 1805 

Val Thr Met Val Phe Pro Asp Ala Gly Thr Phe Ser Ile Arg Leu Asn
1810 1815 1820 

Ala Ser Asn Ala Val Ser Trp Val Ser Ala Thr Tyr Asn Leu Thr Ala 
1825 1830 1835 1840 

Glu Glu Pro Ile Val Gly Leu Val Leu Trp Ala Ser Ser Lys Val Val
1845 1850 1855 

Ala Pro Gly Gln Leu Val His Phe Gln Ile Leu Leu Ala Ala Gly Ser
1860 1865 1870 

Ala Val Thr Phe Arg Leu Gln Val Gly Gly Ala Asn Pro Glu Val Leu
1875 1880 1885 

Pro Gly Pro Arg Phe Ser His Ser Phe Pro Arg Val Gly Asp His Val 
1890 1895 1900

Val Ser Val Arg Gly Lys Asn His Val Ser Trp Ala Gln Ala Gln Val
1905 1910 1915 1920 

Arg Ile Val Val Leu Glu Ala Val Ser Gly Leu Gln Met Pro Asn Cys
1925 1930 1935 

Cys Glu Pro Gly Ile Ala Thr Gly Thr Glu Arg Asn Phe Thr Ala Arg
1940 1945 1950 

Val Gln Arg Gly Ser Arg Val Ala Tyr Ala Trp Tyr Phe Ser Leu Gln
1955 1960 1965 

Lys Val Gln Gly Asp Ser Leu Val Ile Leu Ser Gly Arg Asp Val Thr
1970 1975 1980 

Tyr Thr Pro Val Ala Ala Gly Leu Leu Glu Ile Gln Val Arg Ala Phe
1985 1990 1995 2000 

Asn Ala Leu Gly Ser Glu Asn Arg Thr Leu Val Leu Glu Val Gln Asp
2005 2010 2015 

Ala Val Gln Tyr Val Ala Leu Gln Ser Gly Pro Cys Phe Thr Asn Arg
2020 2025 2030 

Ser Ala Gln Phe Glu Ala Ala Thr Ser Pro Ser Pro Arg Arg Val Ala
2035 2040 2045 

Tyr His Trp Asp Phe Gly Asp Gly Ser Pro Gly Gln Asp Thr Asp Glu
2050 2055 2060 

Pro Arg Ala Glu His Ser Tyr Leu Arg Pro Gly Asp Tyr Arg Val Gln
2065 2070 2075 2080

Val Asn Ala Ser Asn Leu Val Ser Phe Phe Val Ala Gln Ala Thr Val
2085 2090 2095 
Thr Val Gln Val Leu Ala Cys Arg Glu Pro Glu Val Asp Val Val Leu
2100 2105 2110 

Pro Leu Gln Val Leu Met Arg Arg Ser Gln Arg Asn Tyr Leu Glu Ala
2115 2120 2125

His Val Asp Leu Arg Asp Cys Val Thr Tyr Gln Thr Glu Tyr Arg Trp
2130 2135 2140 

Glu Val Tyr Arg Thr Ala Ser Cys Gln Arg Pro Gly Arg Pro Ala Arg
2145 2150 2155 2160 

Val Ala Leu Pro Gly Val Asp Val Ser Arg Pro Arg Leu Val Leu Pro 
2165 2170 2175 

Arg Leu Ala Leu Pro Val Gly His Tyr Cys Phe Val Phe Val Val Ser 
2180 2185 2190 

Phe Gly Asp Thr Pro Leu Thr Gln Ser Ile Gln Ala Asn Val Thr Val
2195 2200 2205 

Ala Pro Glu Arg Leu Val Pro Ile Ile Glu Gly Gly Ser Tyr Arg Val
2210 2215 2220 

Trp Ser Asp Thr Arg Asp Leu Val Leu Asp Gly Ser Glu Ser Tyr Asp 
2225 2230 2235 2240 

Pro Asn Leu Glu Asp Gly Asp Gln Thr Pro Leu Ser Phe His Trp Ala
2245 2250 2255 

Cys Val Ala Ser Thr Gln Arg Glu Ala Gly Gly Cys Ala Leu Asn Phe
2260 2265 2270

Gly Pro Arg Gly Ser Ser Thr Val Thr Ile Pro Arg Glu Arg Leu Ala
2275 2280 2285 

Ala Gly Val Glu Tyr Thr Phe Ser Leu Thr Val Trp Lys Ala Gly Arg
2290 2295 2300 

Lys Glu Glu Ala Thr Asn Gln Thr Val Leu Ile Arg Ser Gly Arg Val
2305 2310 2315 2320 

Pro Ile Val Ser Leu Glu Cys Val Ser Cys Lys Ala Gln Ala Val Tyr
2325 2330 2335 

Glu Val Ser Arg Ser Ser Tyr Val Tyr Leu Glu Gly Arg Cys Leu Asn 
2340 2345 2350

Cys Ser Ser Gly Ser Lys Arg Gly Arg Trp Ala Ala Arg Thr Phe Ser 
2355 2360 2365 

Asn Lys Thr Leu Val Leu Asp Glu Thr Thr Thr Ser Thr Gly Ser Ala 
2370 2375 2380 

Gly Met Arg Leu Val Leu Arg Arg Gly Val Leu Arg Asp Gly Glu Gly
2385 2390 2395 2400 

Tyr Thr Phe Thr Leu Thr Val Leu Gly Arg Ser Gly Glu Glu Glu Gly 
2405 2410 2415
Cys Ala Ser Ile Arg Leu Ser Pro Asn Arg Pro Pro Leu Gly Gly Ser
2420 2425 2430 

Cys Arg Leu Phe Pro Leu Gly Ala Val His Ala Leu Thr Thr Lys Val 
2435 2440 2445 

His Phe Glu Cys Thr Gly Trp His Asp Ala Glu Asp Ala Gly Ala Pro 
2450 2455 2460 

Leu Val Tyr Ala Leu Leu Leu Arg Arg Cys Arg Gln Gly His Cys Glu
2465 2470 2475 2480

Glu Phe Cys Val Tyr Lys Gly Ser Leu Ser Ser Tyr Gly Ala Val Leu 
2485 2490 2495

Pro Pro Gly Phe Arg Pro His Phe Glu Val Gly Leu Ala Val Val Val 
2500 2505 2510 

Gln Asp Gln Leu Gly Ala Ala Val Val Ala Leu Asn Arg Ser Leu Ala
2515 2520 2525 

Ile Thr Leu Pro Glu Pro Asn Gly Ser Ala Thr Gly Leu Thr Val Trp
2530 2535 2540 

Leu His Gly Leu Thr Ala Ser Val Leu Pro Gly Leu Leu Arg Gln Ala
2545 2550 2555 2560



Asp Pro Gln His Val Ile Glu Tyr Ser Leu Ala Leu Val Thr Val Leu
2565 2570 2575 

Asn Glu Tyr Glu Arg Ala Leu Asp Val Ala Ala Glu Pro Lys His Glu 
2580 2585 2590 

Arg Gln His Arg Ala Gln Ile Arg Lys Asn Ile Thr Glu Thr Leu Val
2595 2600 2605 

Ser Leu Arg Val His Thr Val Asp Asp Ile Gln Gln Ile Ala Ala Ala
2610 2615 2620 

Leu Ala Gln Cys Met Gly Pro Ser Arg Glu Leu Val Cys Arg Ser Cys
2625 2630 2635 2640 

Leu Lys Gln Thr Leu His Lys Leu Glu Ala Met Met Leu Ile Leu Gln
2645 2650 2655 

Ala Glu Thr Thr Ala Gly Thr Val Thr Pro Thr Ala Ile Gly Asp Ser
2660 2665 2670 

Ile Leu Asn Ile Thr Gly Asp Leu Ile His Leu Ala Ser Ser Asp Val
2675 2680 2685 

Arg Ala Pro Gln Pro Ser Glu Leu Gly Ala Glu Ser Pro Ser Arg Met
2690 2695 2700 

Val Ala Ser Gln Ala Tyr Asn Leu Thr Ser Ala Leu Met Arg Ile Leu
2705 2710 2715 2720 

Met Arg Ser Arg Val Leu Asn Glu Glu Pro Leu Thr Leu Ala Gly Glu 
2725 2730 2735 
Glu Ile Val Ala Gln Gly Lys Arg Ser Asp Pro Arg Ser Leu Leu Cys
2740 2745 2750

Tyr Gly Gly Ala Pro Gly Pro Gly Cys His Phe Ser Ile Pro Glu Ala
2755 2760 2765 

Phe Ser Gly Ala Leu Ala Asn Leu Ser Asp Val Val Gln Leu Ile Phe
2770 2775 2780 

Leu Val Asp Ser Asn Pro Phe Pro Phe Gly Tyr Ile Ser Asn Tyr Thr
2785 2790 2795 2800 

Val Ser Thr Lys Val Ala Ser Met Ala Phe Gln Thr Gln Ala Gly Ala
2805 2810 2815 

Gln Ile Pro Ile Glu Arg Leu Ala Ser Glu Arg Ala Ile Thr Val Lys
2820 2825 2830

Val Pro Asn Asn Ser Asp Trp Ala Ala Arg Gly His Arg Ser Ser Ala 
2835 2840 2845

Asn Ser Ala Asn Ser Val Val Val Gln Pro Gln Ala Ser Val Gly Ala
2850 2855 2860 

Val Val Thr Leu Asp Ser Ser Asn Pro Ala Ala Gly Leu His Leu Gln
2865 2870 2875 2880 

Leu Asn Tyr Thr Leu Leu Asp Gly His Tyr Leu Ser Glu Glu Pro Glu 
2885 2890 2895 

Pro Tyr Leu Ala Val Tyr Leu His Ser Glu Pro Arg Pro Asn Glu His 
2900 2905 2910

Asn Cys Ser Ala Ser Arg Arg Ile Arg Pro Glu Ser Leu Gln Gly Ala
2915 2920 2925 

Asp His Arg Pro Tyr Thr Phe Phe Ile Ser Pro Gly Ser Arg Asn Pro
2930 2935 2940 

Ala Gly Ser Tyr His Leu Asn Leu Ser Ser His Phe Arg Trp Ser Ala
2945 2950 2955 2960 

Leu Gln Val Ser Val Gly Leu Tyr Thr Ser Leu Cys Gln Tyr Phe Ser
2965 2970 2975 

Glu Glu Asp Met Val Trp Arg Thr Glu Gly Leu Leu Pro Leu Glu Glu 
2980 2985 2990 

Thr Ser Pro Arg Gln Ala Val Cys Leu Thr Arg His Leu Thr Ala Phe
2995 3000 3005

Gly Ala Ser Leu Phe Val Pro Pro Ser His Val Arg Phe Val Phe Pro 
3010 3015 3020 

Glu Pro Thr Ala Asp Val Asn Tyr Ile Val Met Leu Thr Cys Ala Val
3025 3030 3035 3040 

Cys Leu Val Thr Tyr Met Val Met Ala Ala Ile Leu His Lys Leu Asn
3045 3050 3055 
Gln Leu Asp Ala Ser Arg Gly Arg Ala Ile Pro Phe Cys Gly Gln Arg
3060 3065 3070 

Gly Arg Phe Lys Tyr Glu Ile Leu Val Lys Thr Gly Trp Gly Arg Gly
3075 3080 3085

Ser Gly Thr Thr Ala His Val Gly Ile Met Leu Tyr Gly Val Asp Ser
3090 3095 3100 

Arg Ser Gly His Arg His Leu Asp Gly Asp Arg Ala Phe His Arg Asn
3105 3110 3115 3120 

Ser Leu Asp Ile Phe Arg Ile Ala Thr Pro His Ser Leu Gly Ser Val
3125 3130 3135 

Trp Lys Ile Arg Val Trp His Asp Asn Lys Gly Leu Ser Pro Ala Trp
3140 3145 3150 

Phe Leu Gln His Val Ile Val Arg Asp Leu Gln Thr Ala Arg Ser Ala
3155 3160 3165 

Phe Phe Leu Val Asn Asp Trp Leu Ser Val Glu Thr Glu Ala Asn Gly 
3170 3175 3180 

Gly Leu Val Glu Lys Glu Val Leu Ala Ala Ser Asp Ala Ala Leu Leu 
3185 3190 3195 3200 

Arg Phe Arg Arg Leu Leu Val Ala Glu Leu Gln Arg Gly Phe Phe Asp
3205 3210 3215 

Lys His Ile Trp Leu Ser Ile Trp Asp Arg Pro Pro Arg Ser Arg Phe
3220 3225 3230 

Thr Arg Ile Gln Arg Ala Thr Cys Cys Val Leu Leu Ile Cys Leu Phe
3235 3240 3245 

Leu Gly Ala Asn Ala Val Trp Tyr Gly Ala Val Gly Asp Ser Ala Tyr 
3250 3255 3260 

Ser Thr Gly His Val Ser Arg Leu Ser Pro Leu Ser Val Asp Thr Val 
3265 3270 3275 3280 

Ala Val Gly Leu Val Ser Ser Val Val Val Tyr Pro Val Tyr Leu Ala 
3285 3290 3295 

Ile Leu Phe Leu Phe Arg Met Ser Arg Ser Lys Val Ala Gly Ser Pro
3300 3305 3310

Ser Pro Thr Pro Ala Gly Gln Gln Val Leu Asp Ile Asp Ser Cys Leu
3315 3320 3325 

Asp Ser Ser Val Leu Asp Ser Ser Phe Leu Thr Phe Ser Gly Leu His 
3330 3335 3340 

Ala Glu Ala Phe Val Gly Gln Met Lys Ser Asp Leu Phe Leu Asp Asp
3345 3350 3355 3360 

Ser Lys Ser Leu Val Cys Trp Pro Ser Gly Glu Gly Thr Leu Ser Trp 
3365 3370 3375 
Pro Asp Leu Leu Ser Asp Pro Ser Ile Val Gly Ser Asn Leu Arg Gln
3380 3385 3390 

Leu Ala Arg Gly Gln Ala Gly His Gly Leu Gly Pro Glu Glu Asp Gly
3395 3400 3405 

Phe Ser Leu Ala Ser Pro Tyr Ser Pro Ala Lys Ser Phe Ser Ala Ser
3410 3415 3420 

Asp Glu Asp Leu Ile Gln Gln Val Leu Ala Glu Gly Val Ser Ser Pro
3425 3430 3435 3440 

Ala Pro Thr Gln Asp Thr His Met Glu Thr Asp Leu Leu Ser Ser Leu
3445 3450 3455 

Ser Ser Thr Pro Gly Glu Lys Thr Glu Thr Leu Ala Leu Gln Arg Leu
3460 3465 3470 

Gly Glu Leu Gly Pro Pro Ser Pro Gly Leu Asn Trp Glu Gln Pro Gln
3475 3480 3485 

Ala Ala Arg Leu Ser Arg Thr Gly Leu Val Glu Gly Leu Arg Lys Arg 
3490 3495 3500 

Leu Leu Pro Ala Trp Cys Ala Ser Leu Ala His Gly Leu Ser Leu Leu 
3505 3510 3515 3520 

Leu Val Ala Val Ala Val Ala Val Ser Gly Trp Val Gly Ala Ser Phe 
3525 3530 3535 

Pro Pro Gly Val Ser Val Ala Trp Leu Leu Ser Ser Ser Ala Ser Phe 
3540 3545 3550 

Leu Ala Ser Phe Leu Gly Trp Glu Pro Leu Lys Val Leu Leu Glu Ala 
3555 3560 3565 

Leu Tyr Phe Ser Leu Val Ala Lys Arg Leu His Pro Asp Glu Asp Asp 
3570 3575 3580 



Thr Leu Val Glu Ser Pro Ala Val Thr Pro Val Ser Ala Arg Val Pro 
3585 3590 3595 3600

Arg Val Arg Pro Pro His Gly Phe Ala Leu Phe Leu Ala Lys Glu Glu 
3605 3610 3615 

Ala Arg Lys Val Lys Arg Leu His Gly Met Leu Arg Ser Leu Leu Val 
3620 3625 3630 

Tyr Met Leu Phe Leu Leu Val Thr Leu Leu Ala Ser Tyr Gly Asp Ala 
3635 3640 3645 

Ser Cys His Gly His Ala Tyr Arg Leu Gln Ser Ala Ile Lys Gln Glu
3650 3655 3660 

Leu His Ser Arg Ala Phe Leu Ala Ile Thr Arg Ser Glu Glu Leu Trp
3665 3670 3675 3680

Pro Trp Met Ala His Val Leu Leu Pro Tyr Val His Gly Asn Gln Ser
3685 3690 3695
Ser Pro Glu Leu Gly Pro Pro Arg Leu Arg Gln Val Arg Leu Gln Glu
3700 3705 3710 

Ala Leu Tyr Pro Asp Pro Pro Gly Pro Arg Val His Thr Cys Ser Ala 
3715 3720 3725 

Ala Gly Gly Phe Ser Thr Ser Asp Tyr Asp Val Gly Trp Glu Ser Pro 
3730 3735 3740 

His Asn Gly Ser Gly Thr Trp Ala Tyr Ser Ala Pro Asp Leu Leu Gly 
3745 3750 3755 3760 

Ala Trp Ser Trp Gly Ser Cys Ala Val Tyr Asp Ser Gly Gly Tyr Val 
3765 3770 3775 

Gln Glu Leu Gly Leu Ser Leu Glu Glu Ser Arg Asp Arg Leu Arg Phe
3780 3785 3790 

Leu Gln Leu His Asn Trp Leu Asp Asn Arg Ser Arg Ala Val Phe Leu
3795 3800 3805

Glu Leu Thr Arg Tyr Ser Pro Ala Val Gly Leu His Ala Ala Val Thr 
3810 3815 3820

Leu Arg Leu Glu Phe Pro Ala Ala Gly Arg Ala Leu Ala Ala Leu Ser
3825 3830 3835 3840 

Val Arg Pro Phe Ala Leu Arg Arg Leu Ser Ala Gly Leu Ser Leu Pro 
3845 3850 3855

Leu Leu Thr Ser Val Cys Leu Leu Leu Phe Ala Val His Phe Ala Val 
3860 3865 3870 

Ala Glu Ala Arg Thr Trp His Arg Glu Gly Arg Trp Arg Val Leu Arg 
3875 3880 3885 

Leu Gly Ala Trp Ala Arg Trp Leu Leu Val Ala Leu Thr Ala Ala Thr 
3890 3895 3900 

Ala Leu Val Arg Leu Ala Gln Leu Gly Ala Ala Asp Arg Gln Trp Thr
3905 3910 3915 3920

Arg Phe Val Arg Gly Arg Pro Arg Arg Phe Thr Ser Phe Asp Gln Val
3925 3930 3935 

Ala His Val Ser Ser Ala Ala Arg Gly Leu Ala Ala Ser Leu Leu Phe
3940 3945 3950 

Leu Leu Leu Val Lys Ala Ala Gln His Val Arg Phe Val Arg Gln Trp
3955 3960 3965 

Ser Val Phe Gly Lys Thr Leu Cys Arg Ala Leu Pro Glu Leu Leu Gly 
3970 3975 3980 

Val Thr Leu Gly Leu Val Val Leu Gly Val Ala Tyr Ala Gln Leu Ala
3985 3990 3995 4000 

Ile Leu Leu Val Ser Ser Cys Val Asp Ser Leu Trp Ser Val Ala Gln
4005 4010 4015 
Ala Leu Leu Val Leu Cys Pro Gly Thr Gly Leu Ser Thr Leu Cys Pro
4020 4025 4030 

Ala Glu Ser Trp His Leu Ser Pro Leu Leu Cys Val Gly Leu Trp Ala 
4035 4040 4045

Leu Arg Leu Trp Gly Ala Leu Arg Leu Gly Ala Val Ile Leu Arg Trp
4050 4055 4060

Arg Tyr His Ala Leu Arg Gly Glu Leu Tyr Arg Pro Ala Trp Glu Pro 
4065 4070 4075 4080 

Gln Asp Tyr Glu Met Val Glu Leu Phe Leu Arg Arg Leu Arg Leu Trp
4085 4090 4095 

Met Gly Leu Ser Lys Val Lys Glu Phe Arg His Lys Val Arg Phe Glu 
4100 4105 4110 

Gly Met Glu Pro Leu Pro Ser Arg Ser Ser Arg Gly Ser Lys Val Ser 
4115 4120 4125 

Pro Asp Val Pro Pro Pro Ser Ala Gly Ser Asp Ala Ser His Pro Ser 
4130 4135 4140 

Thr Ser Ser Ser Gln Leu Asp Gly Leu Ser Val Ser Leu Gly Arg Leu
4145 4150 4155 4160

Gly Thr Arg Cys Glu Pro Glu Pro Ser Arg Leu Gln Ala Val Phe Glu
4165 4170 4175 

Ala Leu Leu Thr Gln Phe Asp Arg Leu Asn Gln Ala Thr Glu Asp Val
4180 4185 4190 

Tyr Gln Leu Glu Gln Gln Leu His Ser Leu Gln Gly Arg Arg Ser Ser
4195 4200 4205 

Arg Ala Pro Ala Gly Ser Ser Arg Gly Pro Ser Pro Gly Leu Arg Pro
4210 4215 4220 

Ala Leu Pro Ser Arg Leu Ala Arg Ala Ser Arg Gly Val Asp Leu Ala 
4225 4230 4235 4240 

Thr Gly Pro Ser Arg Thr Pro Ser Gly Gln Glu Gln Gly Pro Pro Gln
4245 4250 4255 

Gln His Leu Val Leu Leu Pro Gly Gly Gly Gly Pro Trp Ser Arg Ser
4260 4265 4270

Gly His Arg Ser Val Leu Leu Ser Ala Ala Val Lys Ala Glu Gly Gln
4275 4280 4285 

Ala Glu Trp Leu His Val Gly Ser Pro Glu Ser Arg Gln Gly His Leu
4290 4295 4300

Ser Val Cys Gly Leu Gln His Phe Lys Glu Ala Val Trp Pro Thr Arg
4305 4310 4315 4320 

Thr Gln Gly Pro Leu Pro Ser Ser Leu Gly Lys Asp Thr Ala Val Leu
4325 4330 4335 

Asp Gly Phe 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: (Compare Figure 7)

CTC AAC GAG GAG CCC CTG ACG CTG GCG GGC GAG GAG ATC CTG GCC CAG 48
Leu Asn Glu Glu Pro Leu Thr Leu Ala Gly Glu Glu Ile Val Ala Gln
4340 4345 4350 4355 

GGC AAG CGC TCG GAC CCG CGG AGC CTG CTG TGC TAT GGC GGC GCC CCA 96
Gly Lys Arg Ser Asp Pro Arg Ser Leu Leu Cys Tyr Gly Gly Ala Pro 
4360 4365 4370 

GGG CCT GGC TGC CAC TTC TCC ATC CCC GAG GCT TTC AGC GGG GCC CTG 144
Gly Pro Gly Cys His Phe Ser Ile Pro Glu Ala Phe Ser Gly Ala Leu
4375 4380 4385 

GCC AAC CTC ACT GAC GTG GTG CAG CTC ATC TTT CTG GTG GAC TCC AAT 192
Ala Asn Leu Ser Asp Val Val Gln Leu Ile Phe Leu Val Asp Ser Asn
4390 4395 4400 

CCC TTT CCC TTT GGC TAT ATC AGC AAC TAC ACC GTC TCC ACC AAG GTG 240
Pro Phe Pro Phe Gly Tyr Ile Ser Asn Tyr Thr Val Ser Thr Lys Val
4405 4410 4415 

GOC TOG ATG GCA TIC CAG ACA CAG GOC GGC GOC CAG ATC GOC ATC GAG 288 
Ala Ser Met Ala Phe Gln Thr Gln Ala Gly Ala Gln Ile Pro Ile Glu
4420 4425 4430 4435 

CGG CTG GCC TCA GAG CGC GCC ATC ACC GTG AAG GTG CCC AAC AAC TCG 336
Arg Leu Ala Ser Glu Arg Ala Ile Thr Val Lys Val Pro Asn Asn Ser
4440 4445 4450 

GAC TGG GCT GCC CGG GGC CAC CGC AGC TCC GCC AAC TCC GCC AAC TCC 384
Asp Trp Ala Ala Arg Gly His Arg Ser Ser Ala Asn Ser Ala Asn Ser 
4455 4460 4465 

GTT GTG GTC CAG CCC CAG GCC TCC GTC GGT GCT GTG GTC ACC CTG GAC 432
Val Val Val Gln Pro Gln Ala Ser Val Gly Ala Val Val Thr Leu Asp
4470 4475 4480 

AGC AGC AAC CCT GCG GCC GGG CTG CAT CTG CAG CTC AAC TAT ACG CTG 480
Ser Ser Asn Pro Ala Ala Gly Leu His Leu Gln Leu Asn Tyr Thr Leu
4485 4490 4495 

CTG GAC GGC CAC TAC CTG TCT GAG GAA CCT GAG CCC TAC CTC GCA CTC 528
Leu Asp Gly His Tyr Leu Ser Glu Glu Pro Glu Pro Tyr Leu Ala Val 
4500 4505 4510 4515 

TAC CTA CAC TOG GAG OOC CGG COC AAT GAG CAC AAC TGC TOG GCT AGC 576 
Tyr Leu His Ser Glu Pro Arg Pro Asn Glu His Asn Cys Ser Ala Ser 
4520 4525 4530

AGG AGG ATC CGC CCA GAG TCA CIC CAG GGT GCT GAC CAC CGG COC TAC 624 
Arg Arg Ile Arg Pro Glu Ser Leu Gln Gly Ala Asp His Arg Pro Tyr
4535 4540 4545 

AOC TTC TIC ATT TOC COG GGG AGC AGA GAC CCA GOG GGG ACT TAC CAT 672 
Thr Phe Phe Ile Ser Pro Gly Ser Arg Asp Pro Ala Gly Ser Tyr His
4550 4555 4560 

CTG AAC CIC TOC AGC CAC TIC CGC TGG TOG GOG CTG CAG GTG TOC GTG 720 
Leu Asn Leu Ser Ser His Phe Arg Trp Ser Ala Leu Gln Val Ser Val
4565 4570 4575 
GGC CTG TAC AOG TOC CTG TGC CAG TAC TTC AGC GAG GAG GAC ATG GTG 768
Gly Leu Tyr Thr Ser Leu Cys Gln Tyr Phe Ser Glu Glu Asp Met Val
4580 4585 4590 4595 

TGG OGG ACA GAG GOG CTG CTG OOC CTG GAG GAG ACC TOG COC CGC CAG 816 
Trp Arg Thr Glu Gly Leu Leu Pro Leu Glu Glu Thr Ser Pro Arg Gln
4600 4605 4610 

GOC GTC TGC CTC ACC CGC CAC CTC AOC GOC TTC GGC GOC AGC CTC TTC 864 
Ala Val Cys Leu Thr Arg His Leu Thr Ala Phe Gly Ala Ser Leu Phe 
4615 4620 4625 

GIG COC OCA AGC CAT CTC OQC TTT GIG TIT OCT GAG COG ACA GOG GAT 912 
Val Pro Pro Ser His Val Arg Phe Val Phe Pro Glu Pro Thr Ala Asp 
4630 4635 4640 

CTA AAC TAC ATC GTC ATG CTG ACA TCT GOT GTG TGC CTG GTG ACC TAC 960 
Val Asn Tyr Ile Val Met Leu Thr Cys Ala Val Cys Leu Val Thr Tyr
4645 4650 4655 

ATG GTC ATG GOC GOC ATC CTG CAC AAG CTG GAC CAG TTG GAT GOC AGC 1008 
Met Val Met Ala Ala Ile Leu His Lys Leu Asp Gln Leu Asp Ala Ser
4660 4665 4670 4675 

OGG GOC CGC GOC ATC OCT TTC TGT GGG CAG OGG GGC CGC TTC AAG TAC 1056 
Arg Gly Arg Ala Ile Pro Phe Cys Gly Gln Arg Gly Arg Phe Lys Tyr
4680 4685 4690

GAG ATC CTC GTC AAG ACA GGC TGG GGC OGG GGC TCA GGT ACC AOG GOC 1104 
Glu Ile Leu Val Lys Thr Gly Trp Gly Arg Gly Ser Gly Thr Thr Ala
4695 4700 4705 

CAC GTG GGC ATC ATG CTG TAT GGG GTG GAC AGC OGG AGC GGC CAC GGG 1152 
His Val Gly Ile Met Leu Tyr Gly Val Asp Ser Arg Ser Gly His Arg
4710 4715 4720 

CAC CTG GAC GGC GAC AGA GOC TTC CAC GGC AAC AGC CTG GAC ATC TTC 1200 
His Leu Asp Gly Asp Arg Ala Phe His Arg Asn Ser Leu Asp Ile Phe
4725 4730 4735

OGG ATC GOC AOC COG CAC AGC CTG GGT AGC GTG TGG AAG ATC OGA GIG 1248 
Arg Ile Ala Thr Pro His Ser Leu Gly Ser Val Trp Lys Ile Arg Val
4740 4745 4750 4755 

TGG CAC GAC AAC AAA GGG CTC AGC OCT GOC TGG TTC CTG CAG CAC GTC 1296 
Trp His Asp Asn Lys Gly Leu Ser Pro Ala Trp Phe Leu Gln His Val
4760 4765 4770 

ATC GTC AGG GAC CTG CAG AOG GCA OGC AGC GCC TTC TTC CTG GTC AAT 1344 
Ile Val Arg Asp Leu Gln Thr Ala Arg Ser Ala Phe Phe Leu Val Asn
4775 4780 4785 

GAC TGG CTT TOG GTG GAG AOG GAG GOC AAC GGG GGC CTG GTG GAG AAG 1392 
Asp Trp Leu Ser Val Glu Thr Glu Ala Asn Gly Gly Leu Val Glu Lys 
4790 4795 4800 

GAG GTG CTG GOC GOG AGC GAC GCA GOC CTT TTG OGC TTC OGG OGC CTG 1440 
Glu Val Leu Ala Ala Ser Asp Ala Ala Leu Leu Arg Phe Arg Arg Leu
4805 4810 4815 

CTG GTG GCT GAG CTG CAG OCT GOC TiC TTT GAC AAG CAC ATC TGG CTC 1488 
Leu Val Ala Glu Leu Gln Arg Gly Phe Phe Asp Lys His Ile Trp Leu
4820 4825 4830 4835
TOC ATA TOG GAC OGG OOG OCT OCT AGC OCT TIC ACT OGC ATC CAG AGG 1536
Ser Ile Trp Asp Arg Pro Pro Arg Ser Arg Phe Thr Arg Ile Gln Arg
4840 4845 4850 

GOC AOC TGC TGC GTT CTC CTC ATC TGC CTC TIC CTG GQC GOC AAC GOC 1584 
Ala Thr Cys Cys Val Leu Leu Ile Cys Leu Phe Leu Gly Ala Asn Ala
4855 4860 4865 

GTG TGG TAG GGG GCT GIT GGC GAC TCT GOC TAG AGC AOG GGG CAT GTG 1632 
Val Trp Tyr Gly Ala Val Gly Asp Ser Ala Tyr Ser Thr Gly His Val 
4870 4875 4880 

TOC AGG CTG AGC OOG CTG AGC GIC GAC ACA GIC GCT GTT GGC CTG CTG 1680 
Ser Arg Leu Ser Pro Leu Ser Val Asp Thr Val Ala Val Gly Leu Val 
4885 4890 4895 

TOC AGC GTG GTT GIC TAT O0C GTC TAG CTG GOC ATC CTT TIT CTC TIC 1728 
Ser Ser Val Val Val Tyr Pro Val Tyr Leu Ala Ile Leu Phe Leu Phe
4900 4905 4910 4915 

OGG ATG TOC OGG AGC AAG GTG GCT GGG AGC OOG AGC 00C ACA OCT GOC 1776 
Arg Met Ser Arg Ser Lys Val Ala Gly Ser Pro Ser Pro Thr Pro Ala 
4920 4925 4930 

GGG CAG CAG GIG CTG GAC ATC GAC AGC TGC CTG GAC TOG TOC GTG CTG 1824 
Gly Gln Gln Val Leu Asp Ile Asp Ser Cys Leu Asp Ser Ser Val Leu
4935 4940 4945 

GAC AGC TOC TIC CTC AOG TTC TCA GGC CTC CAC GCT GAG GOC TIT GTT 1872 
Asp Ser Ser Phe Leu Thr Phe Ser Gly Leu His Ala Glu Ala Phe Val 
4950 4955 4960 

GGA CAG ATG AAG ACT GAC TTG TIT CTG GAT GAT TCT AAG ACT CTG GTG 1920 
Gly Gln Met Lys Ser Asp Leu Phe Leu Asp Asp Ser Lys Ser Leu Val
4965 4970 4975 

TGC TGG OOC TOC GGC GAG GGA AOG CTC ACT TGG OOG GAC CTG CIC ACT 1968 
Cys Trp Pro Ser Gly Glu Gly Thr Leu Ser Trp Pro Asp Leu Leu Ser 
4980 4985 4990 4995 

GAC OOG TOC ATT GTG GCT AGC AAT CTG OGG CAG CTG GCA OGG GGC CAG 2016 
Asp Pro Ser Ile Val Gly Ser Asn Leu Arg Gln Leu Ala Arg Gly Gln
5000 5005 5010 

GOG GGC CAT GGG CTG GGC OCA GAG GAG GAC GGC TTC TOC CTG GOC AGC 2064 
Ala Gly His Gly Leu Gly Pro Glu Glu Asp Gly Phe Ser Leu Ala Ser 
5015 5020 5025 

OOC TAC TOG OCT GCC AAA TOC TTC TCA GCA TCA GAT GAA GAC CTG ATC 2112 
Pro Tyr Ser Pro Ala Lys Ser Phe Ser Ala Ser Asp Glu Asp Leu Ile
5030 5035 5040 

CAG CAG GIC CTT GCC GAG GGG GTC AGC AGC OCA GOC OCT AOC CAA GAC 2160 
Gln Gln Val Leu Ala Glu Gly Val Ser Ser Pro Ala Pro Thr Gln Asp
5045 5050 5055 

AOC CAC ATG GAA AOG GAC CTG CTC AGC AGC CTG TOC AGC ACT OCT GGG 2208
Thr His Met Glu Thr Asp Leu Leu Ser Ser Leu Ser Ser Thr Pro Gly
5060 5065 5070 5075
GAG AAG ACA GAG AOG CTG GOG CTG CAG AGG CTG GGG GAG CTG GGG OCA 2256
Glu Lys Thr Glu Thr Leu Ala Leu Gln Arg Leu Gly Glu Leu Gly Pro
5080 5085 5090 

COC AGC CCA GGC CTG AAC TGG GAA CAG CCC CAG OCA GCG AGG CTG TOC 2304
Pro Ser Pro Gly Leu Asn Trp Glu Gln Pro Gln Ala Ala Arg Leu Ser
5095 5100 5105 

AGG ACA GGA CTG GTG GAG GGT CTG CGG AAG GGC CTG CTG COG GOC TGG 2352 
Arg Thr Gly Leu Val Glu Gly Leu Arg Lys Arg Leu Leu Pro Ala Trp 
5110 5115 5120 

TGT GOC TCC CTG GOC CAC GGG CTC AGC CTG CTC CTG GTG GOT GTG GOT 2400 
Cys Ala Ser Leu Ala His Gly Leu Ser Leu Leu Leu Val Ala Val Ala 
5125 5130 5135 

GTG GOT GTC TCA GGG TGG GTG GGT GOG AGC TIC CCC OCG GGC GTG AGT 2448 
Val Ala Val Ser Gly Trp Val Gly Ala Ser Phe Pro Pro Gly Val Ser 
5140 5145 5150 5155 

GTT GCG TGG CTC CTG TCC AGC AGC GOC AGC TIC CTG GOC TCA TTC CTC 2496 
Val Ala Trp Leu Leu Ser Ser Ser Ala Ser Phe Leu Ala Ser Phe Leu 
5160 5165 5170 

GGC TGG GAG OCA CTG AAG GTC TTG CTG GAA GOC CTG TAC TTC TCA CTG 2544 
Gly Trp Glu Pro Leu Lys Val Leu Leu Glu Ala Leu Tyr Phe Ser Leu 
5175 5180 5185 

GTG GOC AAG CGG CTG CAC COG GAT GAA GAT GAC ACC CTG GTA GAG AGC 2592 
Val Ala Lys Arg Leu His Pro Asp Glu Asp Asp Thr Leu Val Glu Ser 
5190 5195 5200

COG GOT GTG AOG OCT GTG AGC GCA CGT GTG CCC OGC GTA CGG CCA CCC 2640 
Pro Ala Val Thr Pro Val Ser Ala Arg Val Pro Arg Val Arg Pro Pro 
5205 5210 5215

CAC GGC TTT GCA CTC TTC CTG GOC AAG GAA GAA GOC GGC AAG GTC AAG 2688 
His Gly Phe Ala Leu Phe Leu Ala Lys Glu Glu Ala Arg Lys Val Lys 
5220 5225 5230 5235 

AGG CTA CAT GGC ATG CTG CGG AGO CTC CTG GTG TAC ATG CTT TTT CTG 2736 
Arg Leu His Gly Met Leu Arg Ser Leu Leu Val Tyr Met Leu Phe Leu 
5240 5245 5250 

CTG GTG ACC CTG CTG GOC AGC TAT GGG GAT GOC TCA TCC CAT GGG CAC 2784 
Leu Val Thr Leu Leu Ala Ser Tyr Gly Asp Ala Ser Cys His Gly His 
5255 5260 5265 

GOC TAC CGT CTG CAA AGC GOC ATC AAG CAG GAG CTG CAC AGC CGG GOC 2832 
Ala Tyr Arg Leu Gln Ser Ala Ile Lys Gln Glu Leu His Ser Arg Ala
5270 5275 5280 

TTC CTG GOC ATC AOG CGG TCT GAG GAG CTC TGG CCA TGG ATG GOC CAC 2880 
Phe Leu Ala Ile Thr Arg Ser Glu Glu Leu Trp Pro Trp Met Ala His
5285 5290 5295 

GTG CTG CTG CCC TAC GTC CAC GGG AAC CAG TCC AGC OCA GAG CIG GGG 2928 
Val Leu Leu Pro Tyr Val His Gly Asn Gln Ser Ser Pro Glu Leu Gly
5300 5305 5310 5315 

CCC CCA CGG CTG CGG CAG GTG CGG CTG CAG GAA GCA CTC TAC CCA GAC 2976 
Pro Pro Arg Leu Arg Gln Val Arg Leu Gln Glu Ala Leu Tyr Pro Asp
5320 5325 5330 

OCT OOC GGC OCC AGG CTC CAC ACG TQC TOG GOC GCA GGA GGC TTC AGC 3024 
Pro Pro Gly Pro Arg Val His Thr Cys Ser Ala Ala Gly Gly Phe Ser 
5335 5340 5345 

AOC AGC GAT TAC GAC GTT GGC TGG GAG ACT OCT CAC AAT GGC TOG GGG 3072 
Thr Ser Asp Tyr Asp Val Gly Trp Glu Ser Pro His Asn Gly Ser Gly 
5350 5355 5360 

AOG TGG GOC TAT TCA GOG O0G GAT CTG CTG GGG GCA TGG T0C TOG GGC 3120 
Thr Trp Ala Tyr Ser Ala Pro Asp Leu Leu Gly Ala Trp Ser Trp Gly 
5365 5370 5375 

T0C TGT GOC GTG TAT GAC AGC GGG GGC TAC GTG CAG GAG CTG GGC CTG 3168 
Ser Cys Ala Val Tyr Asp Ser Gly Gly Tyr Val Gln Glu Leu Gly Leu
5380 5385 5390 5395 

AGC CTG GAG GAG AGC OGC GAC OGG CTG OGC TIC CTG CAG CTG CAC AAC 3216 
Ser Leu Glu Glu Ser Arg Asp Arg Leu Arg Phe Leu Gln Leu His Asn
5400 5405 5410

TGG CTG GAC AAC AGG AGC OGC GOT GTG TTC CTG GAG CTC AOG OGC TAC 3264 
Trp Leu Asp Asn Arg Ser Arg Ala Val Phe Leu Glu Leu Thr Arg Tyr 
5415 5420 5425 

AGC O0G GOC GTG GGG CTG CAC GOC GOC GTC AOG CTG OGC CTC GAG TIC 3312 
Ser Pro Ala Val Gly Leu His Ala Ala Val Thr Leu Arg Leu Glu Phe 
5430 5435 5440 

COG GOG GOC GGC OGC GOC CTG GOC GOC CTC AGC GTC CGC OOC TTT GOG 3360 
Pro Ala Ala Gly Arg Ala Leu Ala Ala Leu Ser Val Arg Pro Phe Ala 
5445 5450 5455 

CTG OGC OGC CTC AGC GOG GGC CTC TOG CTG OCT CTG CTC AOC TOG GTG 3408 
Leu Arg Arg Leu Ser Ala Gly Leu Ser Leu Pro Leu Leu Thr Ser Val 
5460 5465 5470 5475 

TQC CTG CTG CTG TIC GOC CTG CAC TIC GOC CTG GOC GAG GOC OCT ACT 3456 
Cys Leu Leu Leu Phe Ala Val His Phe Ala Val Ala Glu Ala Arg Thr 
5480 5485 5490 

TGG CAC AGG GAA GGG OGC TGG OGC GTG CTG OGG CTC GGA GOC TGG GOG 3504 
Trp His Arg Glu Gly Arg Trp Arg Val Leu Arg Leu Gly Ala Trp Ala 
5495 5500 5505 

OGG TGG CTG CTG GTG GOG CTG AOG GOG GOC AOG GCA CTG GTA OGC CTC 3552 
Arg Trp Leu Leu Val Ala Leu Thr Ala Ala Thr Ala Leu Val Arg Leu 
5510 5515 5520 

GOC CAG CTG GGT GOC GCT GAC OGC CAG TGG AOC OGT TIC GTG OGC GGC 3600 
Ala Gln Leu Gly Ala Ala Asp Arg Gln Trp Thr Arg Phe Val Arg Gly
5525 5530 5535 

OGC COG OGC OGC TTC ACT AGC TTC GAC CAG GTG GOG CAC GTG AGC TOC 3648 
Arg Pro Arg Arg Phe Thr Ser Phe Asp Gln Val Ala His Val Ser Ser
5540 5545 5550 5555 
OCA GOC CGT GGC CIG GOG GOC TOG CTG CIC TIC CTG CTT TIG CTC AAG 3696
Ala Ala Arg Gly Leu Ala Ala Ser Leu Leu Phe Leu Leu Leu Val Lys 
5560 5565 5570 

GCT GOC CAG CAC GTA 090 TIC GIG OGC CAG TOG TOO GIC TIT GGC AAG 3744
Ala Ala Gln His Val Arg Phe Val Arg Gln Trp Ser Val Phe Gly Lys
5575 5580 5585 

ACA TTA TOO OGA GOT CTG OCA GAG CIC CIG GGG GIC AOC TIG GGC CIG 3792 
Thr Leu Cys Arg Ala Leu Pro Glu Leu Leu Gly Val Thr Leu Gly Leu
5590 5595 5600 

GIG GIG CIC GGG GTA GOC TAG GOC CAG CIG GOC ATC CIG CIC GIG TCP 3840 
Val Val Leu Gly Val Ala Tyr Ala Gln Leu Ala Ile Leu Leu Val Ser
5605 5610 5615 

TOC TGT GIG GAC TOC CIC TGG AGC GIG GOC CAG GOC CIG TIG GIG CTG 3888 
Ser Cys Val Asp Ser Leu Trp Ser Val Ala Gln Ala Leu Leu Val Leu
5620 5625 5630 5635 

TGC OCT GGG ACT GGG CIC TCT AOC CTG TGT OCT GOC GAG TOC TOG CAC 3936 
Cys Pro Gly Thr Gly Leu Ser Thr Leu Cys Pro Ala Glu Ser Trp His 
5640 5645 5650 

CTG TCA COC CIG CIG TGT GIG GGG CIC TGG GCA CIG GOG CIG TGG GGC 3984 
Leu Ser Pro Leu Leu Cys Val Gly Leu Trp Ala Leu Arg Leu Trp Gly 
5655 5660 5665 

GOC CPA GGG CIG GGG GOT GIT ATT CIC OGC TGG CGC TAG CAC GOC TIG 4032 
Ala Leu Arg Leu Gly Ala Val Ile Leu Arg Trp Arg Tyr His Ala Leu
5670 5675 5680 

OGT GGA GAG CIG TAC GGG OOG GOC TGG GAG COC CAG GAC TAG GAG ATG 4080 
Arg Gly Glu Leu Tyr Arg Pro Ala Trp Glu Pro Gln Asp Tyr Glu Met
5685 5690 5695 

GIG GAG TIG TIC CIG CGC AGG CTG CGC CIC TGG ATG GGC CIC AGC AAG 4128 
Val Glu Leu Phe Leu Arg Arg Leu Arg Leu Trp Met Gly Leu Ser Lys
5700 5705 5710 5715

GIC AAG GAG TIC OGC CAC AAA GIC OGC TIT GAA GGG ATG GAG OOG CIG 4176 
Val Lys Glu Phe Arg His Lys Val Arg Phe Glu Gly Met Glu Pro Leu 
5720 5725 5730 

COC TCT CGC TOC TOC AGG GGC TOC AAG GTA TOC OOG GAT GIG OOC OCA 4224 
Pro Ser Arg Ser Ser Arg Gly Ser Lys Val Ser Pro Asp Val Pro Pro 
5735 5740 5745 

OOC AGC GOT OGC TOC GAT GOC TOG CAC OOC TCC AOC TOC TOC AGC CAG 4272 
Pro Ser Ala Gly Ser Asp Ala Ser His Pro Ser Thr Ser Ser Ser Gln
5750 5755 5760 

CIG GAT GGG CIG AGC GIG AGC CTG GGC OGG CIG GGG ACA AGG TGT GAG 4320 
Leu Asp Gly Leu Ser Val Ser Leu Gly Arg Leu Gly Thr Arg Cys Glu 
5765 5770 5775 



OCT GAG OOC TOC OGC CIC CAA GOC GIG TIC GAG GOC CIG CTC ACC CAG 4368
Pro Glu Pro Ser Arg Leu Gln Ala Val Phe Glu Ala Leu Leu Thr Gln
5780 5785 5790 5795
ITT GAC OGA CTC AAC CAG GOC ACA GAG GAC GTC TAG CAG CTG GAG CAG 4416
Phe Asp Arg Leu Asn Gln Ala Thr Glu Asp Val Tyr Gln Leu Glu Gln
5800 5805 5810 

CAG CTG CAC AGC CTG CAA GGC OGC AGG AGC AGC OGG GOG OOC GCC GGA 4464 
Gln Leu His Ser Leu Gln Gly Arg Arg Ser Ser Arg Ala Pro Ala Gly
5815 5820 5825 

TCT TOC OCT GGC OCA TOC OOG GGC CTG OGG OCA GCA CTG OOC AGC CGC 4512 
Ser Ser Arg Gly Pro Ser Pro Gly Leu Arg Pro Ala Leu Pro Ser Arg 
5830 5835 5840

CTT GOC OGG GOC ACT OGG GGT CTG GAC CTG GOC ACT GGC OOC AGC AGG 4560 
Leu Ala Arg Ala Ser Arg Gly Val Asp Leu Ala Thr Gly Pro Ser Arg 
5845 5850 5855 

ACA OCT TOG GGC CAA GAA CAA GGT OCA OOC CAG CAG CAC TTA GTC CTC 4608 
Thr Pro Ser Gly Gln Glu Gln Gly Pro Pro Gln Gln His Leu Val Leu
5860 5865 5870 5875 

CTT OCT GGC GOG GGT GGG COG TGG ACT OGG ACT GGA CAC OGC TCA GTA 4656 
Leu Pro Gly Gly Gly Gly Pro Trp Ser Arg Ser Gly His Arg Ser Val 
5880 5885 5890

TTA CTT TCT GOC GCT GTC AAG GOC GAG GGC CAG GCA GAA TGG CTG CAC 4704 
Leu Leu Ser Ala Ala Val Lys Ala Glu Gly Gln Ala Glu Trp Leu His
5895 5900 5905 

GTA GGT TOC OCA GAG AGC AGG CAG GGG CAT CTG TCT GTC TCT GGG CTT 4752 
Val Gly Ser Pro Glu Ser Arg Gln Gly His Leu Ser Val Cys Gly Leu
5910 5915 5920 

CAG CAC TTT AAA GAG GCT GIG TGG OCA ADC AGG AOC CAG GGT OOC CTC 4800 
Gln His Phe Lys Glu Ala Val Trp Pro Thr Arg Thr Gln Gly Pro Leu
5925 5930 5935 

OOC AGC TOC CTT GGG AAG GAC ACA GCA CTA TIG GAC GCT TIC 4842 
Pro Ser Ser Leu Gly Lys Asp Thr Ala Val Leu Asp Gly Phe 
5940 5945 5950 

TAGOCTCTGA GATGCTAATT TATIT0000G ACTOCICAGG TACAGOGGGC TCTG000GGC 4902
O0CAO000CT GGGCAGATCT OCOOCACTGC TAAGGCTGCT GGCTICAGGG AGGGTTAGOC 4962
TGCAO0GO0G OCAOOCTGOC CCTAAGTTAT TAOCTCTOCA G T1 ULTAOOG TACTOOCTGC 5022
AOOGTCTCAC TGICTGTCTC GTGTCACTAA TTTATATGCT GITAAAATGT CTATATTTTT 5082
CTATGTCACT ATTTICACTA GGGCTGAGGG GOCTGOG00C AGAGCTGGOC T0000CAACA 5142
OCTGCTGOGC TTGGTAGGTG TGGTGGOGTT ATGGCAGOCC GGCTQCTGCT TGGATGOGAG 5202
CITGGOCTTG GGCOGGTGCT GGGGGCACAG CTGTPCTGOCA GGCACTCTCA TCAOOCCAGA 5262
GGOCTIGTCA TULTUULT1G O00CAGG0CA GCTAGCAAGA GAGCAGOGOC CAGGOCTGCT 5322
GGCATCAGGT CTGGGTAACT AGCAGGACTA GGCATGTCAG AGGA000CAG GGTGGTTAGA 5382
GGAAAAGACT OCTOCTQGGG GCTGGCTOOC AGGGTGGAGG AAGGPGACTG TGTGTGrGTG 5442
TGICTGOGOG OG0GAOG0GC GAGICTGCTG TATGGOOCAG GCAGOCTCAA QGCOCTCoGA 5502
GCTGGCTGTG OCTQCTICTG TGTAOCACTT CTGTGGGCAT GGOOGCTTCT AGAGOCJTCGA 5562
CACCCCCCCA ACOOOOGCAC CAAGCAGACA AACTCAATAA AAGAGCTGTC TGACTGCAAA 5622 
AAAAAAAAA 5631 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: (Compare Figure 7)

Leu Asn Glu Glu Pro Leu Thr Leu Ala Gly Glu Glu Ile Val Ala Gln
1 5 10 15

Gly Lys Arg Ser Asp Pro Arg Ser Leu Leu Cys Tyr Gly Gly Ala Pro 
20 25 30 

Gly Pro Gly Cys His Phe Ser Ile Pro Glu Ala Phe Ser Gly Ala Leu
35 40 45 

Ala Asn Leu Ser Asp Val Val Gln Leu Ile Phe Leu Val Asp Ser Asn
50 55 60 

Pro Phe Pro Phe Gly Tyr Ile Ser Asn Tyr Thr Val Ser Thr Lys Val
65 70 75 80 

Ala Ser Met Ala Phe Gln Thr Gln Ala Gly Ala Gln Ile Pro Ile Glu
85 90 95 

Arg Leu Ala Ser Glu Arg Ala Ile Thr Val Lys Val Pro Asn Asn Ser
100 105 110 

Asp Trp Ala Ala Arg Gly His Arg Ser Ser Ala Asn Ser Ala Asn Ser 
115 120 125 

Val Val Val Gln Pro Gln Ala Ser Val Gly Ala Val Val Thr Leu Asp
130 135 140

Ser Ser Asn Pro Ala Ala Gly Leu His Leu Gln Leu Asn Tyr Thr Leu
145 150 155 160 

Leu Asp Gly His Tyr Leu Ser Glu Glu Pro Glu Pro Tyr Leu Ala Val
165 170 175 

Tyr Leu His Ser Glu Pro Arg Pro Asn Glu His Asn Cys Ser Ala Ser 
180 185 190 

Arg Arg Ile Arg Pro Glu Ser Leu Gln Gly Ala Asp His Arg Pro Tyr
195 200 205

Thr Phe Phe Ile Ser Pro Gly Ser Arg Asp Pro Ala Gly Ser Tyr His
210 215 220 

Leu Asn Leu Ser Ser His Phe Arg Trp Ser Ala Leu Gln Val Ser Val
225 230 235 240 

Gly Leu Tyr Thr Ser Leu Cys Gln Tyr Phe Ser Glu Glu Asp Met Val
245 250 255 
Trp Arg Thr Glu Gly Leu Leu Pro Leu Glu Glu Thr Ser Pro Arg Gln
260 265 270 

Ala Val Cys Leu Thr Arg His Leu Thr Ala Phe Gly Ala Ser Leu Phe 
275 280 285 

Val Pro Pro Ser His Val Arg Phe Val Phe Pro Glu Pro Thr Ala Asp 
290 295 300 

Val Asn Tyr Ile Val Met Leu Thr Cys Ala Val Cys Leu Val Thr Tyr
305 310 315 320 

Met Val Met Ala Ala Ile Leu His Lys Leu Asp Gln Leu Asp Ala Ser
325 330 335 

Arg Gly Arg Ala Ile Pro Phe Cys Gly Gln Arg Gly Arg Phe Lys Tyr
340 345 350 

Glu Ile Leu Val Lys Thr Gly Trp Gly Arg Gly Ser Gly Thr Thr Ala
355 360 365 

His Val Gly Ile Met Leu Tyr Gly Val Asp Ser Arg Ser Gly His Arg
370 375 380 

His Leu Asp Gly Asp Arg Ala Phe His Arg Asn Ser Leu Asp Ile Phe
385 390 395 400 

Arg Ile Ala Thr Pro His Ser Leu Gly Ser Val Trp Lys Ile Arg Val
405 410 415 

Trp His Asp Asn Lys Gly Leu Ser Pro Ala Trp Phe Leu Gln His Val
420 425 430 

Ile Val Arg Asp Leu Gln Thr Ala Arg Ser Ala Phe Phe Leu Val Asn
435 440 445 

Asp Trp Leu Ser Val Glu Thr Glu Ala Asn Gly Gly Leu Val Glu Lys 
450 455 460 

Glu Val Leu Ala Ala Ser Asp Ala Ala Leu Leu Arg Phe Arg Arg Leu 
465 470 475 480 

Leu Val Ala Glu Leu Gln Arg Gly Phe Phe Asp Lys His Ile Trp Leu
485 490 495 

Ser Ile Trp Asp Arg Pro Pro Arg Ser Arg Phe Thr Arg Ile Gln Arg
500 505 510

Ala Thr Cys Cys Val Leu Leu Ile Cys Leu Phe Leu Gly Ala Asn Ala
515 520 525 

Val Trp Tyr Gly Ala Val Gly Asp Ser Ala Tyr Ser Thr Gly His Val 
530 535 540 

Ser Arg Leu Ser Pro Leu Ser Val Asp Thr Val Ala Val Gly Leu Val 
545 550 555 560

Ser Ser Val Val Val Tyr Pro Val Tyr Leu Ala Ile Leu Phe Leu Phe
565 570 575 
Arg Met Ser Arg Ser Lys Val Ala Gly Ser Pro Ser Pro Thr Pro Ala
580 585 590 

Gly Gln Gln Val Leu Asp Ile Asp Ser Cys Leu Asp Ser Ser Val Leu
595 600 605 

Asp Ser Ser Phe Leu Thr Phe Ser Gly Leu His Ala Glu Ala Phe Val 
610 615 620

Gly Gln Met Lys Ser Asp Leu Phe Leu Asp Asp Ser Lys Ser Leu Val
625 630 635 640 

Cys Trp Pro Ser Gly Glu Gly Thr Leu Ser Trp Pro Asp Leu Leu Ser 
645 650 655 

Asp Pro Ser Ile Val Gly Ser Asn Leu Arg Gln Leu Ala Arg Gly Gln
660 665 670 

Ala Gly His Gly Leu Gly Pro Glu Glu Asp Gly Phe Ser Leu Ala Ser 
675 680 685 

Pro Tyr Ser Pro Ala Lys Ser Phe Ser Ala Ser Asp Glu Asp Leu Ile
690 695 700 

Gln Gln Val Leu Ala Glu Gly Val Ser Ser Pro Ala Pro Thr Gln Asp
705 710 715 720 

Thr His Met Glu Thr Asp Leu Leu Ser Ser Leu Ser Ser Thr Pro Gly 
725 730 735 

Glu Lys Thr Glu Thr Leu Ala Leu Gln Arg Leu Gly Glu Leu Gly Pro
740 745 750 

Pro Ser Pro Gly Leu Asn Trp Glu Gln Pro Gln Ala Ala Arg Leu Ser
755 760 765 

Arg Thr Gly Leu Val Glu Gly Leu Arg Lys Arg Leu Leu Pro Ala Trp
770 775 780 

Cys Ala Ser Leu Ala His Gly Leu Ser Leu Leu Leu Val Ala Val Ala
785 790 795 800 

Val Ala Val Ser Gly Trp Val Gly Ala Ser Phe Pro Pro Gly Val Ser 
805 810 815 

Val Ala Trp Leu Leu Ser Ser Ser Ala Ser Phe Leu Ala Ser Phe Leu 
820 825 830 

Gly Trp Glu Pro Leu Lys Val Leu Leu Glu Ala Leu Tyr Phe Ser Leu 
835 840 845 

Val Ala Lys Arg Leu His Pro Asp Glu Asp Asp Thr Leu Val Glu Ser 
850 855 860 

Pro Ala Val Thr Pro Val Ser Ala Arg Val Pro Arg Val Arg Pro Pro
865 870 875 880

His Gly Phe Ala Leu Phe Leu Ala Lys Glu Glu Ala Arg Lys Val Lys
885 890 895 
Arg Leu His Gly Met Leu Arg Ser Leu Leu Val Tyr Met Leu Phe Leu
900 905 910

Leu Val Thr Leu Leu Ala Ser Tyr Gly Asp Ala Ser Cys His Gly His 
915 920 925 

Ala Tyr Arg Leu Gln Ser Ala Ile Lys Gln Glu Leu His Ser Arg Ala
930 935 940 

Phe Leu Ala Ile Thr Arg Ser Glu Glu Leu Trp Pro Trp Met Ala His
945 950 955 960 

Val Leu Leu Pro Tyr Val His Gly Asn Gln Ser Ser Pro Glu Leu Gly
965 970 975 

Pro Pro Arg Leu Arg Gln Val Arg Leu Gln Glu Ala Leu Tyr Pro Asp
980 985 990

Pro Pro Gly Pro Arg Val His Thr Cys Ser Ala Ala Gly Gly Phe Ser 
995 1000 1005 

Thr Ser Asp Tyr Asp Val Gly Trp Glu Ser Pro His Asn Gly Ser Gly 
1010 1015 1020 

Thr Trp Ala Tyr Ser Ala Pro Asp Leu Leu Gly Ala Trp Ser Trp Gly 
1025 1030 1035 1040 

Ser Cys Ala Val Tyr Asp Ser Gly Gly Tyr Val Gln Glu Leu Gly Leu
1045 1050 1055 

Ser Leu Glu Glu Ser Arg Asp Arg Leu Arg Phe Leu Gln Leu His Asn
1060 1065 1070 

Trp Leu Asp Asn Arg Ser Arg Ala Val Phe Leu Glu Leu Thr Arg Tyr
1075 1080 1085

Ser Pro Ala Val Gly Leu His Ala Ala Val Thr Leu Arg Leu Glu Phe 
1090 1095 1100 

Pro Ala Ala Gly Arg Ala Leu Ala Ala Leu Ser Val Arg Pro Phe Ala 
1105 1110 1115 1120

Leu Arg Arg Leu Ser Ala Gly Leu Ser Leu Pro Leu Leu Thr Ser Val 
1125 1130 1135 

Cys Leu Leu Leu Phe Ala Val His Phe Ala Val Ala Glu Ala Arg Thr 
1140 1145 1150 

Trp His Arg Glu Gly Arg Trp Arg Val Leu Arg Leu Gly Ala Trp Ala 
1155 1160 1165

Arg Trp Leu Leu Val Ala Leu Thr Ala Ala Thr Ala Leu Val Arg Leu
1170 1175 1180 

Ala Gln Leu Gly Ala Ala Asp Arg Gln Trp Thr Arg Phe Val Arg Gly
1185 1190 1195 1200 

Arg Pro Arg Arg Phe Thr Ser Phe Asp Gln Val Ala His Val Ser Ser
1205 1210 1215
Ala Ala Arg Gly Leu Ala Ala Ser Leu Leu Phe Leu Leu Leu Val Lys
1220 1225 1230 

Ala Ala Gln His Val Arg Phe Val Arg Gln Trp Ser Val Phe Gly Lys
1235 1240 1245 

Thr Leu Cys Arg Ala Leu Pro Glu Leu Leu Gly Val Thr Leu Gly Leu 
1250 1255 1260 

Val Val Leu Gly Val Ala Tyr Ala Gln Leu Ala Ile Leu Leu Val Ser
1265 1270 1275 1280 

Ser Cys Val Asp Ser Leu Trp Ser Val Ala Gln Ala Leu Leu Val Leu
1285 1290 1295 

Cys Pro Gly Thr Gly Leu Ser Thr Leu Cys Pro Ala Glu Ser Trp His 
1300 1305 1310 

Leu Ser Pro Leu Leu Cys Val Gly Leu Trp Ala Leu Arg Leu Trp Gly 
1315 1320 1325

Ala Leu Arg Leu Gly Ala Val Ile Leu Arg Trp Arg Tyr His Ala Leu
1330 1335 1340 

Arg Gly Glu Leu Tyr Arg Pro Ala Trp Glu Pro Gln Asp Tyr Glu Met
1345 1350 1355 1360 

Val Glu Leu Phe Leu Arg Arg Leu Arg Leu Trp Met Gly Leu Ser Lys 
1365 1370 1375 

Val Lys Glu Phe Arg His Lys Val Arg Phe Glu Gly Met Glu Pro Leu
1380 1385 1390 

Pro Ser Arg Ser Ser Arg Gly Ser Lys Val Ser Pro Asp Val Pro Pro 
1395 1400 1405 

Pro Ser Ala Gly Ser Asp Ala Ser His Pro Ser Thr Ser Ser Ser Gln
1410 1415 1420 

Leu Asp Gly Leu Ser Val Ser Leu Gly Arg Leu Gly Thr Arg Cys Glu 
1425 1430 1435 1440 

Pro Glu Pro Ser Arg Leu Gln Ala Val Phe Glu Ala Leu Leu Thr Gln
1445 1450 1455 

Phe Asp Arg Leu Asn Gln Ala Thr Glu Asp Val Tyr Gln Leu Glu Gln
1460 1465 1470 

Gln Leu His Ser Leu Gln Gly Arg Arg Ser Ser Arg Ala Pro Ala Gly
1475 1480 1485 

Ser Ser Arg Gly Pro Ser Pro Gly Leu Arg Pro Ala Leu Pro Ser Arg
1490 1495 1500 

Leu Ala Arg Ala Ser Arg Gly Val Asp Leu Ala Thr Gly Pro Ser Arg
1505 1510 1515 1520

Thr Pro Ser Gly Gln Glu Gln Gly Pro Pro Gln Gln His Leu Val Leu
1525 1530 1535 
Leu Pro Gly Gly Gly Gly Pro Trp Ser Arg Ser Gly His Arg Ser Val
1540 1545 1550 

Leu Leu Ser Ala Ala Val Lys Ala Glu Gly Gln Ala Glu Trp Leu His
1555 1560 1565 

Val Gly Ser Pro Glu Ser Arg Gln Gly His Leu Ser Val Cys Gly Leu
1570 1575 1580 

Gln His Phe Lys Glu Ala Val Trp Pro Thr Arg Thr Gln Gly Pro Leu
1585 1590 1595 1600

Pro Ser Ser Leu Gly Lys Asp Thr Ala Val Leu Asp Gly Phe 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: (Compare Figure 8) 

A U L T1U QCAC CATCAAGGGC CAGTITC^ACr TTGTOCAOGT GAT0CTCAOC C0GCTGGACT 60 

AGGACTGCAA OCTOCTGTTOC CTGCACTGCA GGAAAGACAT GGAGGGOCTT CTGGACAOCA 120 

GOGrTGQCCAA GAT0GTGTCT GACOGCAAOC TGOCCTTOCT GGOCOG0CAG AT GG00CTQC 180 

ACGCAAATAT GGOCTCACAG GrTGCATCATA GGCQCTOCAA O0OCAO0GAT ATCTACOOCT 240 

CCAAGTGGAT TGCOOGGCTC CGOCACATCA AG0GGCT00G CCAGOGGATC TQCGAGGAAG 300 

COGOCTACTC CAACOOCAGC CTAOCTCTGG TGCA00CT0C GTOOCATAGC AAAGOOOCTG 360 

CACAGACTCC AQ00GAG00C ACACCTGQCT ATGAGGTGGG CCAGOGGAAG OGOCTCATCT 420 

CCTOGGTGGA GGACTTCACC GAGTTTCTCT GAGGOOGGGG OCCT00CT0C TGCACTGGOC 480 

TTGGAOGGTA TTOCCTGICA GTTGAAATAAA TAAAGTOCTG AOOOCAGIGC ACAGACATAG 540 

AGGCACAGAT TGC 553 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: (Compare Figure 9)

CTQCTGTCTG TGAGACCTGC GQGGCTGGGA AGTC7ITGGCA GAG00G0GAG TAOXTTCCTC 60 

ACTOCTTITG TTCTTITGAC GTAAGCTG3C GAGTGGCACT GCCTGAGTIC CGCTCAGTGC 120 

O0G0CCTGAT GTG0GGAO0C CGCTGCATTC TTGCTGTTAG GTOGTGGOGG TGIGCOCiXSV 180 

OGCTGGTGGG CACOGAGAGT CITTOGGAGC TTTGGGGAGG TTCTGOCAAG OCTGAGOCTC 240

GAOGTCOOCC TICOOGGCTT TCTC7TTGGCT CTTCTGAGGC CAGGQCATCT CTATGAGGGC 300 

CTOCTGCTGG AGOOGTTCTCT GTGGATCTOC TCTGOCATCC TGGCOCATGA CTGGG7PGATG 360 

CGCTGGOCAC CATCTGGTGA CAGTGGOOGG GCAOOGCTGC CAAATCTGGG TOOOGCATCT 420 

GCAAG000CT CCCTOGCTCC CCTAGGCTAT GGGOTGGTTC TGOCACTGCr CT0GCT000C 480 

CAOCTTGGGf? TQCCTCTOOC CCTGCTOGTG GGGGAGA 517 